Novel nucleic acid sequences encoding adenylate kinases, alcohol dehydrogenases, ubiquitin proteases, lipases, adenylate cyclases, and GTPase activators

ABSTRACT

The invention provides isolated nucleic acid molecules that encode novel polypeptides. The invention also provides antisense nucleic acid molecules, recombinant expression vectors containing the nucleic acid molecules of the invention, host cells into which the expression vectors have been introduced, and nonhuman transgenic animals in which a sequence of the invention has been introduced or disrupted. The invention still further provides isolated proteins, fusion proteins, antigenic peptides and antibodies. Diagnostic methods utilizing compositions of the invention are also provided.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application is a divisional of U.S. patent application Ser. No. 10/165,231, filed Jun. 6, 2002 (pending), which is a continuation-in-part of U.S. patent application Ser. No. 09/390,038, filed Sep. 3, 1999 (abandoned). U.S. patent application Ser. No. 10/165,231 is also a continuation-in-part of U.S. patent application Ser. No. 09/796,089, filed Feb. 28, 2001 (abandoned), which is a continuation-in-part of U.S. patent application Ser. No. 09/464,039, filed Dec. 15, 1999, now U.S. Pat. No. 7,094,565, and claims the benefit of International Application No. 371 PCT/US00/33873, filed Dec. 15, 2000. U.S. patent application Ser. No. 10/165,231 is also a continuation-in-part of U.S. patent application Ser. No. 09/972,525, filed Oct. 5, 2001 (abandoned), which is a divisional of U.S. patent application Ser. No. 09/408,865, filed Sep. 30, 1999, now U.S. Pat. No. 6,329,171. U.S. patent application Ser. No. 10/165,231 is also a continuation-in-part of Ser. No. 09/963,908, filed Sep. 26, 2001, now U.S. Pat. No. 6,797,502, which is a divisional of U.S. patent application Ser. No. 09/434,613, filed Nov. 5, 1999, now U.S. Pat. No. 6,337,187. U.S. patent application Ser. No. 10/165,231 is also a continuation-in-part of U.S. patent application Ser. No. 09/461,076, filed Dec. 14, 1999 (abandoned). U.S. patent application Ser. No. 10/165,231 is also a continuation-in-part of U.S. patent application Ser. No. 09/802,127, filed Feb. 26, 2001 (abandoned), which claims the benefit of U.S. Provisional Application Ser. No. 60/185,611, filed Feb. 29, 2000 (abandoned). The entire contents of each of the above-referenced patent applications are incorporated herein by this reference.

FIELD OF THE INVENTION

The invention relates to novel nucleic acid sequences and polypeptides. Also provided are vectors, host cells, and recombinant methods for making and using the novel molecules.

TABLE OF CONTENTS

-   Chapter 1 23552, A Novel Adenylate Kinase
    -   i) SEQ ID NOS: 1-4
    -   ii) FIGS. 1-6
    -   iii) Continuation-In-Part of Ser. No. 09/390,038, filed Sep. 3, 1999
-   Chapter 2 21612, 21615, 21620, 21676, 33756, Novel Human Alcohol Dehydrogenases
    -   i) SEQ ID NOS: 5-14
    -   ii) FIGS. 7A-32
    -   iii) Continuation-In-Part of Ser. No. 09/796,089, filed Feb. 28, 2001, which is a continuation-in-part of Ser. No. 09/464,039, filed Dec. 15, 1999, and claims the benefit of 371 PCT/US00/33873, filed Dec. 15, 2000
-   Chapter 3 23484, A Novel Human Ubiquitin Protease
    -   i) SEQ ID NOS: 15-16
    -   ii) FIGS. 33A-38
    -   iii) Continuation-In-Part of Ser. No. 09/972,525, filed Oct. 5, 2001, which is a divisional of Ser. No. 09/408,865, filed Sep. 30, 1999, now U.S. Pat. No. 6,329,171
-   Chapter 4 18891, A Novel Human Lipase
    -   i) SEQ ID NOS: 17-18
    -   ii) FIGS. 39A-45
    -   iii) Continuation-In-Part of Ser. No. 09/963,908, filed Sep. 26, 2001, which is a divisional of Ser. No. 09/434,613, filed Nov. 5, 1999, now U.S. Pat. No. 6,337,187
-   Chapter 5 25678, A Novel Human Adenylate Cyclase
    -   i) SEQ ID NOS: 19-20
    -   ii) FIGS. 46A-51
    -   iii) Continuation-In-Part of Ser. No. 09/461,076, filed Dec. 14, 1999
-   Chapter 6 Novel Human GTPase Activators
    -   i) SEQ ID NOS: 21-30
    -   ii) FIGS. 52A-64B
    -   iii) Continuation-In-Part of Ser. No. 09/802,127, filed Feb. 26, 2001, which claims the benefit of U.S. Provisional 60/185,611, filed Feb. 29, 2000

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 shows the amino acid sequence alignment for the protein (h23552; SEQ ID NO:2) encoded by human 23552 (SEQ ID NO:1) with the porcine UMP-CMP kinase (SP Accession Number Q29561; SEQ ID NO:3). The sequence alignment was generated using the Clustal method. The two sequences share approximately 97.4% identity over a 196 amino acid overlap as determined by pairwise alignment. Asterisks indicate identical residues.
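For illustration only, a percent identity figure of the kind quoted for FIG. 1 can be computed from a gapped pairwise alignment as sketched below. The function name `percent_identity`, the choice to exclude gap columns from the overlap, and the toy sequences are assumptions made for this sketch; the specification does not state which denominator convention the Clustal-based comparison used.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> tuple[float, int]:
    """Return (percent identity, overlap length) for two aligned sequences.

    Assumption: the overlap counts only columns where neither sequence
    has a gap. Other conventions (e.g. dividing by the full alignment
    length) would give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    overlap = 0  # columns where both sequences have a residue
    matches = 0  # columns where the two residues are identical
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gap columns
        overlap += 1
        if a == b:
            matches += 1

    if overlap == 0:
        raise ValueError("no overlapping residues")
    return 100.0 * matches / overlap, overlap


# Toy alignment (hypothetical sequences, not taken from SEQ ID NO:2 or 3).
identity, overlap = percent_identity("MKT-LV", "MRTALV")
print(f"{identity:.1f}% identity over a {overlap} amino acid overlap")
```

In this toy case four of the five gap-free columns match, so the function reports 80.0% identity over a 5 amino acid overlap.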

FIG. 2 shows the 23552 nucleotide sequence (SEQ ID NO:1) and amino acids 1 to 609 of the amino acid sequence set forth in SEQ ID NO:2.

FIG. 3 shows an analysis of the 23552 amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 4 shows a 23552 receptor hydrophobicity plot and the 23552 amino acid sequence (SEQ ID NO:2).

FIG. 5 shows an analysis of the 23552 open reading frame for amino acids corresponding to specific functional sites. These sites are relevant with regard to providing fragments of the 23552 nucleic acid or peptide as disclosed herein. The 23552 amino acid sequence contains an N-glycosylation site from amino acids 137 to 140 of SEQ ID NO:2; protein kinase C phosphorylation sites at amino acids 21-23, 29-31, 170-172, 190-192 of SEQ ID NO:2; casein kinase II phosphorylation sites at amino acids 65-68 and 212-215 of SEQ ID NO:2; tyrosine kinase phosphorylation sites at amino acids 54-61 and 74-81 of SEQ ID NO:2; N-myristoylation sites at amino acids 42-47 and 49-54 of SEQ ID NO:2; and an adenylate kinase signature at amino acids 121-132 of SEQ ID NO:2.
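As a hedged illustration of how functional-site coordinates like those listed for FIG. 5 can be located, the sketch below scans a protein sequence with regular expressions. The patterns are the commonly published PROSITE-style consensus motifs for these site types, rewritten as Python regular expressions; they are an assumption of this sketch, since the specification does not list the exact patterns used. The function name `find_sites` and the toy sequence are hypothetical.

```python
import re

# Assumed consensus motifs (PROSITE-style), expressed as regular expressions.
PATTERNS = {
    "N-glycosylation": r"N[^P][ST][^P]",
    "Protein kinase C phosphorylation": r"[ST].[RK]",
    "Casein kinase II phosphorylation": r"[ST]..[DE]",
    "N-myristoylation": r"G[^EDRKHPFYW]..[STAGCN][^P]",
}


def find_sites(sequence: str, patterns: dict = PATTERNS) -> dict:
    """Return {site name: [(start, end), ...]} using 1-based inclusive positions."""
    hits = {}
    for name, regex in patterns.items():
        # A zero-width lookahead lets overlapping sites be reported.
        lookahead = re.compile(f"(?=({regex}))")
        hits[name] = [
            (m.start() + 1, m.start() + len(m.group(1)))
            for m in lookahead.finditer(sequence)
        ]
    return hits


# Toy sequence (hypothetical, not SEQ ID NO:2).
for site, positions in find_sites("AANGTAKSAAE").items():
    print(site, positions)
```

Matches of this kind are only predictions: short motifs occur frequently by chance, which is consistent with the specification describing such sites as putative elsewhere.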

FIG. 6 shows a comparison of the 23552 protein against the PROSITE database of protein patterns, specifically showing a high score against an adenylate kinase consensus sequence set forth in SEQ ID NO:4.

FIGS. 7A-7B show the nucleotide sequence (SEQ ID NO:6) and the deduced amino acid sequence (SEQ ID NO:5) of the novel 21620 ADH.

FIG. 8 shows an analysis of the 21620 ADH amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 9 shows a hydrophobicity plot of the 21620 ADH amino acid sequence (SEQ ID NO:5). Also shown is the predicted transmembrane segment from about amino acid 13 to about amino acid 32. In addition, a graphical representation of the functional domain of ADH short chain is also shown.
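A hydrophobicity plot such as the one described for FIG. 9 is typically a sliding-window average of per-residue hydropathy values. The sketch below uses the Kyte-Doolittle scale and a window of 9 residues; both choices, along with the function name `hydropathy_profile`, are assumptions for illustration, because the specification does not name the scale or window used to generate the figures.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Return the mean hydropathy of every full window along the sequence.

    Element i of the result covers residues i+1 .. i+window (1-based).
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(sequence) - window + 1)
    ]


# Toy sequence (hypothetical): a hydrophobic stretch flanked by charged residues.
profile = hydropathy_profile("KKDEILLVVIALLAGKRDE", window=9)
peak = max(range(len(profile)), key=profile.__getitem__)
print(f"most hydrophobic window starts at residue {peak + 1}")
```

A sustained run of high window averages over roughly 20 residues is the usual visual cue for a candidate transmembrane segment, although dedicated predictors apply additional criteria.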

FIG. 10 shows an analysis of the 21620 ADH open reading frame for amino acids (SEQ ID NO:5) corresponding to specific functional sites. A putative protein kinase C phosphorylation site is found from about amino acid 135 to about amino acid 137. Putative casein kinase II phosphorylation sites are found from about amino acid 72 to about amino acid 75, from about amino acid 89 to about amino acid 92, and from about amino acid 135 to about amino acid 138. Putative N-myristoylation sites are found from about amino acid 18 to about amino acid 23, from about amino acid 24 to about amino acid 29, from about amino acid 40 to about amino acid 45, from about amino acid 90 to about amino acid 95, from about amino acid 109 to about amino acid 114, and from about amino acid 199 to about amino acid 204. In addition, amino acids corresponding to the short-chain alcohol dehydrogenase family signature are found in the sequence at about amino acids 166 to 176.

FIG. 11 shows expression of the 21620 ADH mRNA in various tissues. 21620 expression levels were determined by quantitative PCR (Taqman® brand quantitative PCR kit, Applied Biosystems). The quantitative PCR reactions were performed according to the kit manufacturer's instructions.
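The specification states only that the quantitative PCR was run according to the kit manufacturer's instructions. As a hedged sketch of one common way such data are converted into relative expression levels, the comparative Ct calculation (2 to the power of minus ΔΔCt) is shown below. Its use here is an assumption, as are the function name `relative_expression`, the presence of a reference gene and calibrator sample, and the example Ct values.

```python
def relative_expression(
    ct_target: float,
    ct_reference: float,
    ct_target_calibrator: float,
    ct_reference_calibrator: float,
) -> float:
    """Fold change of a target transcript relative to a calibrator sample.

    Assumption: both amplicons amplify with roughly 100% efficiency,
    so each cycle corresponds to a two-fold change in template.
    """
    # Normalize the target to the reference gene within each sample.
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    # Compare the sample with the calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier
# in the sample than in the calibrator, at equal reference-gene Ct.
print(relative_expression(24.0, 18.0, 26.0, 18.0))  # 4.0
```

A lower Ct means more starting template, which is why the exponent carries a negative sign.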

FIG. 12 shows expression of the 21620 ADH mRNA in normal and malignant breast, lung, liver and colon tissues. The liver metastases are derived from malignant colonic tissue. 21620 expression levels were determined by quantitative PCR (Taqman® brand quantitative PCR kit, Applied Biosystems).

FIG. 13 shows the nucleotide sequence (SEQ ID NO:8) and the deduced amino acid sequence (SEQ ID NO:7) of the novel 33756 ADH.

FIG. 14 shows an analysis of the 33756 ADH amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 15 shows a hydrophobicity plot of the 33756 ADH amino acid sequence (SEQ ID NO:7). Also shown is a graphical representation of the functional domain of ADH short chain.

FIG. 16 shows an analysis of the 33756 ADH open reading frame (SEQ ID NO:7) for amino acids corresponding to specific functional sites. A putative N-glycosylation site is found from about amino acid 100 to about amino acid 103. Putative protein kinase C phosphorylation sites are found from about amino acid 29 to about amino acid 31, from about amino acid 32 to about amino acid 34, from about amino acid 120 to about amino acid 122, from about amino acid 144 to about amino acid 146, from about amino acid 213 to about amino acid 215, from about amino acid 242 to about amino acid 244, and from about amino acid 252 to about amino acid 254. Putative casein kinase II phosphorylation sites are found from about amino acid 32 to about amino acid 35, from about amino acid 63 to about amino acid 66, and from about amino acid 252 to about amino acid 255. Putative N-myristoylation sites are found from about amino acid 149 to about amino acid 154, from about amino acid 160 to about amino acid 165, and from about amino acid 171 to about amino acid 176.

FIGS. 17A-17B show the nucleotide sequence (SEQ ID NO:9) and the deduced amino acid sequence (SEQ ID NO:10) of the novel 21676 ADH.

FIG. 18 shows an analysis of the 21676 ADH amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 19 shows a hydrophobicity plot of the 21676 ADH amino acid sequence (SEQ ID NO:9). Also shown is the predicted amino terminus signal peptide sequence. In addition, two transmembrane segments are predicted for the full-length polypeptide from about amino acid 8 to about amino acid 25 and from about amino acid 242 to about amino acid 261. In the mature form of the polypeptide the transmembrane domain is predicted from about amino acid 226 to about amino acid 245. Also shown is a graphical representation of the functional domain of ADH short chain.

FIG. 20 shows an analysis of the 21676 ADH open reading frame (SEQ ID NO:9) for amino acids corresponding to specific functional sites. A putative N-glycosylation site is found from about amino acid 171 to about amino acid 174. Putative protein kinase C phosphorylation sites are found from about amino acid 100 to about amino acid 102, from about amino acid 103 to about amino acid 105, from about amino acid 191 to about amino acid 193, from about amino acid 215 to about amino acid 217, from about amino acid 284 to about amino acid 286, from about amino acid 313 to about amino acid 315, and from about amino acid 323 to about amino acid 325. Putative casein kinase II phosphorylation sites are found from about amino acid 54 to about amino acid 57, from about amino acid 103 to about amino acid 106, from about amino acid 134 to about amino acid 137, and from about amino acid 323 to about amino acid 326. Putative N-myristoylation sites are found from about amino acid 12 to about amino acid 17, from about amino acid 28 to about amino acid 33, from about amino acid 45 to about amino acid 50, from about amino acid 220 to about amino acid 225, from about amino acid 231 to about amino acid 236, and from about amino acid 242 to about amino acid 247.

FIGS. 21A-21B show the nucleotide sequence (SEQ ID NO:12) and the deduced amino acid sequence (SEQ ID NO:11) of the novel 21612 ADH.

FIG. 22 shows an analysis of the 21612 ADH amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 23 shows a hydrophobicity plot of the 21612 ADH amino acid sequence (SEQ ID NO:11). Also shown is a graphical representation of the functional domain of ADH short chain.

FIG. 24 shows an analysis of the 21612 ADH open reading frame (SEQ ID NO:11) for amino acids corresponding to specific functional sites. A putative N-glycosylation site is found from about amino acid 101 to about amino acid 104. Putative protein kinase C phosphorylation sites are found from about amino acid 5 to about amino acid 7, from about amino acid 115 to about amino acid 117, from about amino acid 282 to about amino acid 284, from about amino acid 313 to about amino acid 315, from about amino acid 381 to about amino acid 383, and from about amino acid 392 to about amino acid 394. Putative casein kinase II phosphorylation sites are found from about amino acid 56 to about amino acid 59, from about amino acid 320 to about amino acid 323, from about amino acid 338 to about amino acid 341, and from about amino acid 372 to about amino acid 375. Putative N-myristoylation sites are found from about amino acid 17 to about amino acid 22, from about amino acid 52 to about amino acid 57, from about amino acid 128 to about amino acid 133, and from about amino acid 353 to about amino acid 358. In addition, a microbodies C-terminal targeting signal is found from about amino acid 416 to about amino acid 418.

FIGS. 25A-25B show the nucleotide sequence (SEQ ID NO:14) and the deduced amino acid sequence (SEQ ID NO:13) of the novel 21615 ADH.

FIG. 26 shows an analysis of the 21615 ADH amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 27 shows a hydrophobicity plot of the 21615 ADH amino acid sequence (SEQ ID NO:13). Also shown is the predicted transmembrane segment from about amino acid 8 to about amino acid 27. In addition, a graphical representation of the functional domain of ADH short chain is also shown.

FIG. 28 shows an analysis of the 21615 ADH open reading frame (SEQ ID NO:13) for amino acids corresponding to specific functional sites. Putative N-glycosylation sites are found from about amino acid 39 to about amino acid 42 and from about amino acid 130 to about amino acid 133. Putative protein kinase C phosphorylation sites are found from about amino acid 60 to about amino acid 62, from about amino acid 137 to about amino acid 139, from about amino acid 149 to about amino acid 151, and from about amino acid 208 to about amino acid 210. Putative casein kinase II phosphorylation sites are found from about amino acid 89 to about amino acid 92, from about amino acid 184 to about amino acid 187, and from about amino acid 213 to about amino acid 216. A putative tyrosine kinase site is found from about amino acid 42 to about amino acid 49. Putative N-myristoylation sites are found from about amino acid 17 to about amino acid 22, from about amino acid 126 to about amino acid 131, from about amino acid 156 to about amino acid 161, and from about amino acid 169 to about amino acid 174. In addition, a short-chain alcohol dehydrogenase family signature is found from about amino acid 147 to about amino acid 157.

FIG. 29 shows a time course of levels of 21612 mRNA expression in the human colon cancer cell line HCT-116 following synchronization with nocodazole. 21612 mRNA levels rose steadily as the HCT-116 cells re-entered the cell cycle. 21612 expression levels were determined by quantitative PCR (Taqman® brand quantitative PCR kit, Applied Biosystems). The quantitative PCR reactions were performed according to the kit manufacturer's instructions.

FIG. 30 shows 21612 mRNA expression in a panel of normal and oncological tissues. 21612 expression is shown for 3 normal breast tissue samples (columns 1-3); 6 breast tumor samples (columns 4-9), including invasive ductal carcinomas (columns 4, 6, 7, and 8) and moderately differentiated invasive ductal carcinomas (column 5); 2 normal ovary tissue samples (columns 10 and 11); 5 ovarian cancer samples (columns 12-16); 3 normal lung samples (columns 17-19); 7 lung cancer samples (columns 20-26), including small cell carcinoma (column 20), poorly differentiated non-small cell carcinoma of the lung (columns 21-23), and adenocarcinoma (columns 25 and 26); 3 normal colon samples (columns 27-29); 4 colon cancer samples (columns 30-33), including moderately differentiated (columns 30 and 31) and moderately to partially differentiated tumor (column 33); two colon cancer liver metastases samples (columns 34 and 35); one normal liver sample (column 36); one hemangioma sample (column 37); and two human pulmonary microvascular endothelial cell samples, one arresting (column 38) and one proliferating (column 39). 21612 expression levels were determined by quantitative PCR (Taqman® brand quantitative PCR kit, Applied Biosystems).

FIG. 31 shows the levels of 21612 mRNA in samples from normal colon (columns 1-3), colon cancer tissue (columns 4-7), colon cancer liver metastases (columns 8 and 9), and normal liver (column 10). 21612 expression levels were determined by quantitative PCR as described above. Note that 21612 mRNA levels were significantly higher in 5 of 6 colon cancer and colon cancer liver metastases samples, in comparison with the expression levels in normal colon samples.

FIG. 32 shows the levels of 21612 mRNA in the human colon cancer cell line HCT116 following cell cycle synchronization with nocodazole. The first panel of this figure shows a time course of the level of 21612 mRNA following synchronization. Note that the 21612 levels increase for approximately 21 hours following removal of nocodazole. The second panel of this figure shows the time course of 21612 expression in a population of HCT116 cells in the G₀, G₁, or S phase of the cell cycle. Cell populations containing cells in these phases of the cell cycle were isolated by fluorescence-activated cell sorting. 21612 expression intensity at each time point was determined by microarray hybridization. Note that the levels of 21612 mRNA in G₀/G₁/S phase HCT116 cells increase significantly in the first three hours following withdrawal of nocodazole.

FIGS. 33A-33D show the nucleotide sequence (SEQ ID NO:16) and the deduced amino acid sequence (SEQ ID NO:15) of the novel ubiquitin protease. The underlined amino acids designate the conserved cysteine region and conserved histidine region. These regions are conserved among thiol protease members of the UBP and UCH protein families.

FIG. 34 shows an analysis of the ubiquitin protease amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 35 shows a hydrophobicity plot of the ubiquitin protease (SEQ ID NO:15).

FIGS. 36A-36D show an analysis of the ubiquitin protease open reading frame for amino acids corresponding to specific functional sites of SEQ ID NO:15. Glycosylation sites are found from about amino acid 134 to 137, with the modified amino acid at position 134; from about amino acid 333 to 336, with the modified amino acid at position 333; from about amino acid 398 to 401, with the modified amino acid at position 398; from about amino acid 492 to 495, with the modified amino acid at position 492; from about amino acid 560 to 563, with the modified amino acid at position 560; from about amino acid 644 to 647, with the modified amino acid at position 644; and from about amino acid 672 to 675, with the modified amino acid at position 672. Cyclic AMP and cyclic GMP-dependent protein kinase phosphorylation sites are found from about amino acid 15 to 18, with the modified amino acid at position 18; from about amino acid 313 to 316, with the modified amino acid at position 316; from about amino acid 607 to 610, with the modified amino acid at position 610; from about amino acid 694 to 697, with the modified amino acid at position 697; and from about amino acid 812 to 815, with the modified amino acid at position 815.

Protein kinase C phosphorylation sites are found from about amino acid 31 to 33, with the modified amino acid at position 31; from about amino acid 107 to 109, with the modified amino acid at position 107; from about amino acid 111 to 113, with the modified amino acid at position 111; from about amino acid 312 to 314, with the modified amino acid at position 312; from about amino acid 327 to 329, with the modified amino acid at position 327; from about amino acid 426 to 428, with the modified amino acid at position 426; from about amino acid 453 to 455, with the modified amino acid at position 453; from about amino acid 467 to 469, with the modified amino acid at position 467; from about amino acid 475 to 477, with the modified amino acid at position 475; from about amino acid 515 to 517, with the modified amino acid at position 515; from about amino acid 546 to 548, with the modified amino acid at position 546; from about amino acid 561 to 563, with the modified amino acid at position 561; from about amino acid 566 to 568, with the modified amino acid at position 566; from about amino acid 582 to 584, with the modified amino acid at position 582; from about amino acid 623 to 625, with the modified amino acid at position 623; from about amino acid 629 to 631, with the modified amino acid at position 629; from about amino acid 662 to 664, with the modified amino acid at position 662; from about amino acid 692 to 694, with the modified amino acid at position 692; from about amino acid 748 to 750, with the modified amino acid at position 748; from about amino acid 765 to 767, with the modified amino acid at position 765; from about amino acid 809 to 811, with the modified amino acid at position 809; from about amino acid 865 to 867, with the modified amino acid at position 865; from about amino acid 911 to 913, with the modified amino acid at position 911; from about amino acid 952 to 954, with the modified amino acid at position 952; from about amino acid 965 to 967, with the modified amino acid at position 965; from about amino acid 980 to 982, with the modified amino acid at position 980; from about amino acid 1034 to 1036, with the modified amino acid at position 1034; from about amino acid 1103 to 1105, with the modified amino acid at position 1103; and from about amino acid 1120 to 1122, with the modified amino acid at position 1120.

Casein kinase II phosphorylation sites are found from about amino acid 18 to 21; from amino acid 75 to 78; from amino acid 92 to 95; from amino acid 260 to 263; from amino acid 481 to 484; from amino acid 527 to 530; from amino acid 613 to 616; from amino acid 656 to 659; from amino acid 673 to 676; from amino acid 703 to 706; from amino acid 807 to 810; and from amino acid 1067 to 1070. Tyrosine kinase phosphorylation sites are found from about amino acid 83 to 90, with the modified amino acid at position 90; from about amino acid 338 to 345, with the modified amino acid at position 345; and from about amino acid 1031 to 1038, with the modified amino acid at position 1038. N-myristoylation sites are found from about amino acid 85 to 90, with the modified amino acid at position 85; from about amino acid 336 to 341, with the modified amino acid at position 336; from about amino acid 486 to 491, with the modified amino acid at position 486; from about amino acid 493 to 498, with the modified amino acid at position 493; from about amino acid 552 to 557, with the modified amino acid at position 552; from amino acid 570 to 575, with the modified amino acid at position 570; from amino acid 595 to 600, with the modified amino acid at position 595; from amino acid 609 to 614, with the modified amino acid at position 609; and from amino acid 898 to 903, with the modified amino acid at position 898. Amidation sites are found from about amino acid 13 to 16; from about amino acid 467 to 470; from about amino acid 532 to 535; from about amino acid 841 to 844; and from about amino acid 1038 to 1041.

In addition, an amino acid signature corresponding to the MHC immunoglobulins and major histocompatibility complex proteins is found from amino acids 376 to 382. The amino acids corresponding to the UCH family 2 signature are found at amino acids 365-383.

FIG. 37 shows expression of the protease in normal and malignant breast, lung, liver and colon tissues. The liver metastases are derived from malignant colonic tissue.

FIG. 38 shows expression of the protease in various tissues and cell types in culture. The expression data was derived from RT-PCR of various cDNA libraries.

FIGS. 39A-39B show the nucleotide sequence (SEQ ID NO:18) and the deduced amino acid sequence (SEQ ID NO:17) of the novel lipase.

FIG. 40 shows an analysis of the lipase amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 41 shows a hydrophobicity plot of the lipase (SEQ ID NO:17). The analysis predicted a 35 amino acid signal peptide sequence at the amino terminus of the protein. Transmembrane segments of both the full length lipase and the mature lipase are also shown.

FIG. 42 shows an analysis of the lipase open reading frame for amino acids corresponding to specific functional sites of SEQ ID NO:17. Protein kinase C phosphorylation sites are found from about amino acid 63 to 65; from about amino acid 111 to 113; from about amino acid 252 to 254; and from about amino acid 316 to 318. Casein kinase II phosphorylation sites are found from about amino acid 114 to 117; from amino acid 205 to 208; and from amino acid 284 to 287. N-myristoylation sites are found from about amino acid 13 to 18, with the modified amino acid at position 13; from about amino acid 110 to 115, with the modified amino acid at position 110; from about amino acid 146 to 151, with the modified amino acid at position 146; from about amino acid 155 to 160, with the modified amino acid at position 155; and from about amino acid 175 to 180, with the modified amino acid at position 175.

FIG. 43 shows expression of the lipase mRNA in various tissues and cell types in culture. The expression data was derived from RT-PCR of various cDNA libraries. The primers used were designed to amplify coding sequences.

FIG. 44 shows expression of the lipase mRNA in normal and malignant breast, lung, liver and colon tissues. The liver metastases are derived from malignant colonic tissue. The expression data was derived from RT-PCR designed to amplify coding sequences.

FIG. 45 shows expression of the lipase mRNA in normal and malignant breast, lung, liver and colon tissues. The liver metastases are derived from malignant colonic tissue. The expression data was derived from RT-PCR designed to amplify the untranslated region of the lipase.

FIGS. 46A-46D show the adenylate cyclase nucleotide sequence (SEQ ID NO:20) and the deduced amino acid sequence (SEQ ID NO:19).

FIG. 47 shows an analysis of the adenylate cyclase amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 48 shows a hydrophobicity plot of the adenylate cyclase.

FIGS. 49A-49C show an analysis of the adenylate cyclase open reading frame for amino acids corresponding to specific functional sites of SEQ ID NO:19. Glycosylation sites are shown in the figure with the actual modified residue being the first amino acid. Protein kinase C phosphorylation sites are shown in the figure with the actual modified residue being the first amino acid. Casein kinase II phosphorylation sites are shown in the figure with the actual modified residue being the first amino acid. Tyrosine kinase phosphorylation sites are shown in the figure with the actual modified residue being the last amino acid. N-myristoylation sites are shown in the figure, with the actual modified residue being the first amino acid. In addition, amino acids corresponding to the guanylate cyclase signature are found at amino acids 394-417 and 1009-1032.
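The first-residue and last-residue conventions stated for FIGS. 49A-49C can be captured in a small lookup, sketched below for illustration. The dictionary `MODIFIED_END`, the function name `modified_residue`, and the example coordinates are hypothetical; only the first/last assignments themselves are taken from the description of these figures.

```python
# Which end of a predicted site carries the modified residue,
# following the conventions stated for FIGS. 49A-49C.
MODIFIED_END = {
    "glycosylation": "first",
    "protein kinase C phosphorylation": "first",
    "casein kinase II phosphorylation": "first",
    "tyrosine kinase phosphorylation": "last",
    "N-myristoylation": "first",
}


def modified_residue(site_type: str, start: int, end: int) -> int:
    """Return the 1-based position of the modified residue within a site."""
    if start > end:
        raise ValueError("start must not exceed end")
    return start if MODIFIED_END[site_type] == "first" else end


# Hypothetical site coordinates, used only to show the lookup.
print(modified_residue("glycosylation", 5, 8))                      # 5
print(modified_residue("tyrosine kinase phosphorylation", 10, 17))  # 17
```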

FIG. 50 shows expression of the 25678 adenylate cyclase in various normal human tissues.

FIG. 51 shows expression of the 25678 adenylate cyclase in various cardiovascular tissues. Int. Mamm: internal mammary artery; CHF: congestive heart failure; ISCH: ischemic heart; myop: myopathic heart.

FIGS. 52A-52B show the 26651 nucleotide sequence (SEQ ID NO:21) and the deduced amino acid sequence (SEQ ID NO:22). The coding sequence for 26651 is set forth in SEQ ID NO:23.

FIG. 53 shows a 26651 protein hydrophobicity plot. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) and N-glycosylation site (Ngly) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence (shown in SEQ ID NO:22) of human 26651 are indicated. Polypeptides of the invention include fragments which include: all or a part of a hydrophobic sequence (a sequence above the dashed line); or all or part of a hydrophilic fragment (a sequence below the dashed line). Other fragments include a cysteine residue or an N-glycosylation site.

FIG. 54 shows an analysis of the 26651 amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIG. 55 shows an analysis of the 26651 open reading frame for amino acids corresponding to predicted functional sites. For the cAMP- and cGMP-dependent protein kinase phosphorylation site, the actual modified residue is the last amino acid. For the protein kinase C phosphorylation sites, the actual modified residue is the first amino acid. For the casein kinase II phosphorylation sites, the actual modified residue is the first amino acid. For the tyrosine kinase phosphorylation site, the actual modified residue is the last amino acid.

FIGS. 56A-56B depict an alignment of the rho-GAP domain of human 26651 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:27), while the lower amino acid sequence corresponds to amino acids 236 to 397 of SEQ ID NO:22. The top half of the figure was obtained by searching for complete domains using PFAM. In the lower portion of the figure, a portion of human 26651 (amino acids 233 to 423 of SEQ ID NO:22) is aligned with a consensus rho-GAP_3 domain (SEQ ID NO:28). The lower half of the figure was obtained by searching for complete domains in SMART.

FIGS. 57A-57C show the 26138 nucleotide sequence (SEQ ID NO:24) and the deduced 26138 amino acid sequence (SEQ ID NO:25). The coding sequence for 26138 is set forth in SEQ ID NO:26.

FIG. 58 shows a 26138 protein hydrophobicity plot. Relative hydrophobic residues are shown above the dashed horizontal line, and relative hydrophilic residues are below the dashed horizontal line. The cysteine residues (cys) and N-glycosylation site (Ngly) are indicated by short vertical lines just below the hydropathy trace. The numbers corresponding to the amino acid sequence (shown in SEQ ID NO:25) of human 26138 are indicated. Polypeptides of the invention include fragments which include: all or a part of a hydrophobic sequence (a sequence above the dashed line); or all or part of a hydrophilic fragment (a sequence below the dashed line). Other fragments include a cysteine residue or an N-glycosylation site.

FIG. 59 shows an analysis of the 26138 amino acid sequence: α, β, turn and coil regions; hydrophilicity; amphipathic regions; flexible regions; antigenic index; and surface probability plot.

FIGS. 60A-60B show an analysis of the 26138 open reading frame for amino acids corresponding to predicted functional sites. For the N-glycosylation site, the actual modified residue is the first amino acid. For the N-myristoylation, the actual modified residue is the first amino acid. For the cAMP- and cGMP-dependent protein kinase phosphorylation site, the actual modified residue is the first amino acid. For the protein kinase C phosphorylation sites, the actual modified residue is the first amino acid. For the casein kinase II phosphorylation sites, the actual modified residue is the first amino acid. In addition there is a Ras GTPase activating protein signature.

FIGS. 61A-61B depict an alignment of the ras-GAP domain of human 26138 with a consensus amino acid sequence derived from a hidden Markov model. The upper sequence is the consensus amino acid sequence (SEQ ID NO:29), while the lower amino acid sequence corresponds to amino acids 473 to 645 of SEQ ID NO:25. The top half of the figure shows the results of a search for complete domains using PFAM for the 26138 protein. In the lower portion of the figure, a portion of human 26138 (amino acids 401 to 723 of SEQ ID NO:25) is aligned with a consensus ras-GAP_2 domain (SEQ ID NO:30). The lower half of the figure was obtained by searching for complete domains in SMART.

FIGS. 62A-62B show a PSORT prediction of protein localization for the 26138 GAP protein.

FIG. 63 shows chromosome mapping information for the 26138 GAP gene.

FIGS. 64A-64B depict expression of 26138 in various human tissues and cell types: lung (column 1); kidney (column 2); brain (column 3); heart (column 4); colon (column 5); tonsil (column 6); spleen (column 7); fetal liver (column 8); pooled liver samples (column 9); stellate cells treated with 1,25-dihydroxyvitamin D3 (column 10); serum reactivated stellate cells (column 11); NHLF-CTN (column 12); NHLF-TGF, normal human lung fibroblasts treated with TGF-β (column 13); hepG2 CTN (column 14); hepG2 TGF, hepG2 cells treated with TGF-β (column 15); LF NDR 190, fibrotic liver (column 16); LF NDR 191, fibrotic liver (column 17); LF NDR 194, fibrotic liver (column 18); LF NDR 113 (column 19); Th1 48 hr M4 (column 20); Th1 48 hr M5 (column 21); Th2 48 hr M5 (column 22); granulocytes (column 23); CD19+ cells (column 24); CD14+ cells (column 25); PBMC mock, peripheral blood mononuclear cells (column 26); PBMC PHA, PBMC treated with phytohaemagglutinin (column 27); PBMC IL10, PBMC producing IL10 (column 28); PBMC 1113 (column 29); NHBE mock, normal human bronchial epithelial cells (column 30); NHBE IL13-1 (column 31); BM-MNC, bone marrow mononuclear cells (column 32); mPB CD34, CD34+ cells from mobilized peripheral blood (column 33); ABM CD34+, CD34+ cells from adult bone marrow (column 34); erythroid cells (column 35); megakaryocytes (column 36); neutrophils (column 37); mBM CD11b+, CD11b+ cells from human mobilized bone marrow (column 38); mBM CD15+, CD15+ mobilized human bone marrow (column 39); mBM CD11b-, CD11b- cells from human mobilized bone marrow (column 40); BM/GPA, GPA+ cells from human bone marrow (column 41); BM CD71, CD71 positive bone marrow cells (column 42); HepG2A (column 43); HepG2 2.21-a (column 44); and no template control (column 45). Expression levels were determined by quantitative RT-PCR (Taqman® brand quantitative PCR kit, Applied Biosystems). The quantitative RT-PCR reactions were performed according to the kit manufacturer's instructions.

CHAPTER 1

23552, A Novel Adenylate Kinase

BACKGROUND OF THE INVENTION

Adenylate kinases play a key role in the regulation of energy balance within cells, particularly maintenance of the ratio of ATP to its diphosphate (ADP) and monophosphate (AMP) forms. ATP serves as the primary source of energy for biochemical reactions in cells and is also a key precursor in DNA and RNA synthesis during cellular growth and replication. The energy associated with the terminal phosphate bonds of ATP may be transferred to other nucleotides using a nucleoside monophosphate kinase such as adenylate kinase. In this manner, the terminal energy-rich phosphate bonds of ATP may be transferred to the appropriate nucleotides for use in a variety of biosynthetic and energy-requiring processes, such as biosynthesis of macromolecules, active ion transport, muscle contraction, thermogenesis, etc. A number of these energy-requiring biosynthetic reactions hydrolyze ATP into AMP plus pyrophosphate. Reutilization of the resulting AMP requires conversion back into the triphosphate form following conversion to ADP. Various nucleotide monophosphate kinases carry out the first step of phosphorylating AMP to its diphosphate form at the expense of ATP. In the case of adenylate kinase, this reversible reaction is given as AMP + ATP ⇌ 2 ADP.
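The reversible adenylate kinase reaction implies a mass-action relationship among AMP, ADP, and ATP. The following Python sketch is offered only as an illustration and is not part of the original disclosure. It assumes the reaction is at equilibrium with K = [ADP]^2 / ([AMP][ATP]); the function names, the example concentrations, and the default value K = 1.0 are hypothetical placeholders, not measured values.

```python
def amp_at_equilibrium(atp: float, adp: float, k_eq: float = 1.0) -> float:
    """Return the AMP concentration satisfying K = [ADP]^2 / ([AMP][ATP]).

    k_eq defaults to 1.0 purely for illustration (an assumption, not a
    value taken from the text). Concentrations are in arbitrary units.
    """
    if atp <= 0 or k_eq <= 0:
        raise ValueError("atp and k_eq must be positive")
    if adp < 0:
        raise ValueError("adp must be non-negative")
    return adp ** 2 / (k_eq * atp)


def amp_to_atp_ratio(atp: float, adp: float, k_eq: float = 1.0) -> float:
    """Return the AMP:ATP ratio implied by the equilibrium above."""
    return amp_at_equilibrium(atp, adp, k_eq) / atp


if __name__ == "__main__":
    # Hypothetical concentrations: halving ATP at constant ADP raises
    # the implied AMP:ATP ratio, under the stated equilibrium assumption.
    for atp in (4.0, 2.0):
        print(atp, amp_to_atp_ratio(atp, adp=2.0))
```

Under this simplifying assumption the AMP:ATP ratio varies as the inverse square of the ATP concentration at fixed ADP, which is one way to see why the ratio can act as a sensitive indicator of cellular energy status.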

Adenylate kinases also play a role in regulating the flow of carbon between net accumulation of glucose via the gluconeogenesis pathway and its subsequent catabolism via the glycolytic pathway by way of their control over the ratio of AMP to ATP. AMP is a positive allosteric effector of the enzyme 6-phosphofructo-1-kinase and a negative allosteric effector of the enzyme fructose-1,6-bisphosphatase. When the first of these enzymes is activated, carbon flow is shifted in the direction of glycolysis; when the second of these enzymes is activated, carbon flow shifts in the direction of gluconeogenesis. Thus, increases in the ratio of AMP to ATP shift carbon flow toward glycolysis, while decreases in the ratio of AMP to ATP shift carbon flow toward glucose formation.

These enzymes have been studied in a number of mammals, including rat, porcine, chicken, bovine, rabbit, and humans. Evidence from biochemical studies suggests that human tissues have five adenylate kinase isozymes, AK1-AK5. Thus far the cDNAs of human AK1, AK2, AK4, and AK5 have been cloned. Adenylate kinase isoforms in humans are sequence related and also related to UMP/CMP kinases from several species. See Rompay et al. (1999) Eur. J. Biochem. 261:509-516, and the references cited therein.

The adenylate kinase isozymes AK1 (or myokinase), which is a cytosolic enzyme present in brain, skeletal muscle, and erythrocytes, and AK2, which is associated with the mitochondrial membrane in liver, spleen, heart, and kidney, both utilize ATP as their nucleoside triphosphate donor substrate. AK3 (or GTP:AMP phosphotransferase) is located in the mitochondrial matrix, primarily in heart and liver cells, and uses MgGTP instead of MgATP. AK4 and AK5 are both localized in brain tissue.

Several regions of AK family enzymes are well conserved, including the nucleoside triphosphate binding glycine-rich region, the nucleoside monophosphate binding site, and the lid domain that closes over the substrate upon binding (see Schulz (1987) Cold Spring Harbor Symp. Quant. Biol. 52:429-439).

These enzymes assist with maintenance of energy production and utilization within cells, particularly in cells having high rates of growth and metabolic activity such as in heart, skeletal muscle, and liver. In fact, adenylate kinase deficiency has been linked to hemolytic anemia and neurological disorders such as neurofibromatosis (Xu et al. (1992) Genomics 13:537-542). In addition, targeting regulation of ATP synthesis has been the basis of antiproliferative drugs for treatment of viral infections and cancer.

Adenylate kinases are also useful for activating nucleoside analogues used as pharmaceuticals, especially for cancer and viral infection. Most of these analogues must be phosphorylated to the triphosphate form in order to be pharmaceutically active. The first phosphorylation step in the activation of nucleoside analogues is catalyzed by deoxyribonucleoside kinases. Phosphorylation to the di- and triphosphates is then required.

Accordingly, adenylate kinases are a major target for drug action and development. Thus, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown adenylate kinases. The present invention advances the state of the art by providing a previously unidentified human adenylate kinase.

SUMMARY OF THE INVENTION

Isolated nucleic acid molecules corresponding to adenylate kinase nucleic acid sequences are provided. Additionally, amino acid sequences corresponding to the polynucleotides are encompassed. In particular, the present invention provides for isolated nucleic acid molecules comprising nucleotide sequences encoding the amino acid sequence shown in SEQ ID NO:2 or the nucleotide sequences encoding the DNA sequence deposited in a bacterial host with the Patent Depository of the American Type Culture Collection (ATCC) as Patent Deposit Number PTA-1850. Further provided are adenylate kinase polypeptides having an amino acid sequence encoded by a nucleic acid molecule described herein.

The present invention also provides vectors and host cells for recombinant expression of the nucleic acid molecules described herein, as well as methods of making such vectors and host cells and for using them for production of the polypeptides or peptides of the invention by recombinant techniques.

The adenylate kinase molecules of the present invention are useful for modulating cellular growth and/or cellular metabolic pathways, particularly for regulating one or more proteins involved in growth and metabolism. Accordingly, in one aspect, this invention provides isolated nucleic acid molecules encoding adenylate kinase proteins or biologically active portions thereof, as well as nucleic acid fragments suitable as primers or hybridization probes for the detection of adenylate kinase-encoding nucleic acids.

Another aspect of this invention features isolated or recombinant adenylate kinase proteins and polypeptides. Preferred adenylate kinase proteins and polypeptides possess at least one biological activity possessed by naturally occurring adenylate kinase proteins.

Variant nucleic acid molecules and polypeptides substantially homologous to the nucleotide and amino acid sequences set forth in the sequence listings are encompassed by the present invention. Additionally, fragments and substantially homologous fragments of the nucleotide and amino acid sequences are provided.

Antibodies and antibody fragments that selectively bind the adenylate kinase polypeptides and fragments are provided. Such antibodies are useful in detecting the adenylate kinase polypeptides as well as in regulating the T-cell immune response and cellular activity, particularly growth and proliferation.

In another aspect, the present invention provides a method for detecting the presence of adenylate kinase activity or expression in a biological sample by contacting the biological sample with an agent capable of detecting an indicator of adenylate kinase activity such that the presence of adenylate kinase activity is detected in the biological sample.

In yet another aspect, the invention provides a method for modulating adenylate kinase activity comprising contacting a cell with an agent that modulates (inhibits or stimulates) adenylate kinase activity or expression such that adenylate kinase activity or expression in the cell is modulated. In one embodiment, the agent is an antibody that specifically binds to adenylate kinase protein. In another embodiment, the agent modulates expression of adenylate kinase protein by modulating transcription of an adenylate kinase gene, splicing of an adenylate kinase mRNA, or translation of an adenylate kinase mRNA. In yet another embodiment, the agent is a nucleic acid molecule having a nucleotide sequence that is antisense to the coding strand of the adenylate kinase mRNA or the adenylate kinase gene.

In one embodiment, the methods of the present invention are used to treat a subject having a disorder characterized by aberrant adenylate kinase protein activity or nucleic acid expression by administering an agent that is an adenylate kinase modulator to the subject. In one embodiment, the adenylate kinase modulator is an adenylate kinase protein. In another embodiment, the adenylate kinase modulator is an adenylate kinase nucleic acid molecule. In other embodiments, the adenylate kinase modulator is a peptide, peptidomimetic, or other small molecule.

The present invention also provides a diagnostic assay for identifying the presence or absence of a genetic lesion or mutation characterized by at least one of the following: (1) aberrant modification or mutation of a gene encoding an adenylate kinase protein; (2) misregulation of a gene encoding an adenylate kinase protein; and (3) aberrant post-translational modification of an adenylate kinase protein, wherein a wild-type form of the gene encodes a protein with an adenylate kinase activity.

In another aspect, the invention provides a method for identifying a compound that binds to or modulates the activity of an adenylate kinase protein. In general, such methods entail measuring a biological activity of an adenylate kinase protein in the presence and absence of a test compound and identifying those compounds that alter the activity of the adenylate kinase protein.

The invention also features methods for identifying a compound that modulates the expression of adenylate kinase genes by measuring the expression of the adenylate kinase sequences in the presence and absence of the compound.

Other features and advantages of the invention will be apparent from the following detailed description and claims.

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the invention are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.

Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

The present invention is based, at least in part, on the identification of novel molecules, referred to herein as adenylate kinase nucleic acid and polypeptide molecules, which play a role in, or function in, numerous biochemical pathways associated with cellular growth and/or cellular metabolic activity. These growth and metabolic pathways are described in Lodish et al. (1995) Molecular Cell Biology (Scientific American Books Inc., New York, N.Y.) and Devlin (1997) Textbook of Biochemistry with Clinical Correlations (Wiley-Liss, Inc., New York, N.Y.), the contents of which are incorporated herein by reference. In one embodiment, the adenylate kinase molecules modulate the activity of one or more proteins involved in cellular growth or differentiation, e.g., cardiac, epithelial, or neuronal cell growth or differentiation. In another embodiment, the adenylate kinase molecules of the present invention are capable of modulating the phosphorylation state of a nucleoside mono-, di-, or triphosphate molecule or the phosphorylation state of one or more proteins involved in cellular growth or differentiation, e.g., cardiac, epithelial, or neuronal cell growth or differentiation, as described in, for example, Lodish et al. (1995) and Devlin (1997), supra. In addition, the substrates of the adenylate kinases of the present invention are targets of drugs described in Goodman and Gilman (1996), The Pharmacological Basis of Therapeutics (9th ed.) Hardman & Limbird Editors, the contents of which are incorporated herein by reference. Particularly, the adenylate kinases of the invention may modulate phosphorylation activity in tissues and cells including lymph node, spleen, thymus, brain, lung, skeletal muscle, fetal liver, tonsil, colon, heart, liver, immune cells, including T cells, Th1 and Th2 cells, leukocytes, blood marrow, etc. In one embodiment, the adenylate kinase sequences of the invention are used to manipulate the nucleoside mono-, di-, and triphosphate pool to alter cellular metabolic pathways, such as glycolysis and gluconeogenesis.

Adenylate kinases play an important role in the regulation of energy balance within cells and in energy-requiring biochemical processes associated with cellular growth and development. Inhibition or over-stimulation of the activity of adenylate kinases affects the cellular equilibrium between nucleoside mono-, di-, and triphosphates, particularly AMP, ADP, and ATP, all of which are integrally involved in energy-requiring biochemical processes associated with cellular growth and development. Disruption or modulation of this equilibrium can lead to perturbed cellular growth, which can in turn lead to cellular growth-related disorders. As used herein, a “cellular growth-related disorder” includes a disorder, disease, or condition characterized by a deregulation, e.g., an upregulation or a downregulation, of cellular growth. Cellular growth deregulation may be due to a deregulation of cellular proliferation, cell cycle progression, cellular differentiation and/or cellular hypertrophy. Examples of cellular growth-related disorders include cardiovascular disorders such as heart failure, hypertension, atrial fibrillation, dilated cardiomyopathy, idiopathic cardiomyopathy, or angina; proliferative disorders or differentiative disorders such as cancer, e.g., melanoma, prostate cancer, cervical cancer, breast cancer, colon cancer, or sarcoma. Disorders associated with the following cells or tissues are also encompassed: lymph node, spleen, thymus, brain, lung, skeletal muscle, fetal liver, tonsil, colon, heart, liver, immune cells, including T cells, Th1 and Th2 cells, leukocytes, blood marrow, etc. The compositions are also useful for the treatment of liver fibrosis and other liver-related disorders.

Furthermore, adenylate kinase activity increases in cerebrospinal fluid at the acute onset of ischemic brain damage and is correlated with the severity of the lesion (Buttner et al. (1986) J. Neurol. 233:297-303). Adenyl kinase activity also increases in cerebrospinal fluid in some brain tumors (Ronquist et al. (1977) Lancet i: 1284-1286). Further, adenyl kinase may be expressed in damaged tissue and therefore is a useful target to measure tissue damage. Finally, deletions at the 1p31 locus in many tumors are associated with hemolytic anemia (Matsuura et al. (1989) J. Biol. Chem. 264:10148-10155 and Mitelman et al. (1997) Nature Genet. 15:417-474). Accordingly, the compositions are also useful for treatment and diagnosis related to these disorders.

The disclosed invention relates to methods and compositions for the modulation, diagnosis, and treatment of immune, inflammatory, respiratory, and hematological disorders.

Immune disorders include, but are not limited to, chronic inflammatory diseases and disorders, such as Crohn's disease, reactive arthritis, including Lyme disease, insulin-dependent diabetes, organ-specific autoimmunity, including multiple sclerosis, Hashimoto's thyroiditis and Graves' disease, contact dermatitis, psoriasis, graft rejection, graft versus host disease, sarcoidosis, atopic conditions, such as asthma and allergy, including allergic rhinitis, gastrointestinal allergies, including food allergies, eosinophilia, conjunctivitis, glomerular nephritis, certain pathogen susceptibilities such as helminthic (e.g., leishmaniasis), certain viral infections, including HIV, and bacterial infections, including tuberculosis and lepromatous leprosy.

Respiratory disorders include, but are not limited to, apnea, asthma, particularly bronchial asthma, beryllium disease, bronchiectasis, bronchitis, bronchopneumonia, cystic fibrosis, diphtheria, dyspnea, emphysema, chronic obstructive pulmonary disease, allergic bronchopulmonary aspergillosis, pneumonia, acute pulmonary edema, pertussis, pharyngitis, atelectasis, Wegener's granulomatosis, Legionnaires disease, pleurisy, rheumatic fever, and sinusitis.

Hematologic disorders include, but are not limited to, anemias including sickle cell and hemolytic anemia, hemophilias including types A and B, leukemias, thalassemias, spherocytosis, Von Willebrand disease, chronic granulomatous disease, glucose-6-phosphate dehydrogenase deficiency, thrombosis, clotting factor abnormalities and deficiencies including factor VIII and IX deficiencies, hemarthrosis, hematemesis, hematomas, hematuria, hemochromatosis, hemoglobinuria, hemolytic-uremic syndrome, thrombocytopenias including HIV-associated thrombocytopenia, hemorrhagic telangiectasia, idiopathic thrombocytopenic purpura, thrombotic microangiopathy, and hemosiderosis.

Liver disorders include, but are not limited to, hepatic injury; jaundice and cholestasis, such as bilirubin and bile formation; hepatic failure and cirrhosis, such as cirrhosis, portal hypertension, including ascites, portosystemic shunts, and splenomegaly; infectious disorders, including, but not limited to, infectious hepatitis, such as viral hepatitis, including hepatitis A-E infection and infection by other hepatitis viruses, clinicopathologic syndromes, such as the carrier state, asymptomatic infection, acute viral hepatitis, chronic viral hepatitis, and fulminant hepatitis; autoimmune hepatitis; other forms of hepatitis; drug- and toxin-induced liver disease, such as alcoholic liver disease; inborn errors of metabolism and pediatric liver disease, such as hemochromatosis, Wilson disease, α₁-antitrypsin deficiency, and neonatal hepatitis; intrahepatic biliary tract disease, such as secondary biliary cirrhosis, primary biliary cirrhosis, primary sclerosing cholangitis, and anomalies of the biliary tree; circulatory disorders, such as impaired blood flow into the liver, including hepatic artery compromise and portal vein obstruction and thrombosis, impaired blood flow through the liver, including passive congestion and centrilobular necrosis and peliosis hepatis, hepatic vein outflow obstruction, including hepatic vein thrombosis (Budd-Chiari syndrome) and veno-occlusive disease; hepatic disease associated with pregnancy, such as preeclampsia and eclampsia, acute fatty liver of pregnancy, and intrahepatic cholestasis of pregnancy; hepatic complications of organ or bone marrow transplantation, such as drug toxicity after bone marrow transplantation, graft-versus-host disease and liver rejection, and nonimmunologic damage to liver allografts; tumors and tumorous conditions, such as nodular hyperplasias, adenomas, and malignant tumors, including hepatocellular carcinoma, primary carcinoma of the liver and metastatic tumors.

Preferred disorders include, but are not limited to, hepatitis, especially viral hepatitis, and hepatocellular carcinoma.

The disclosed invention also relates to methods and compositions for the modulation, diagnosis, and treatment of disorders involving the brain, heart, lung, colon, and spleen.

Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia and focal cerebral ischemia), infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

Disorders involving the heart include, but are not limited to, heart failure, including but not limited to, cardiac hypertrophy, left-sided heart failure, and right-sided heart failure; ischemic heart disease, including but not limited to angina pectoris, myocardial infarction, chronic ischemic heart disease, and sudden cardiac death; hypertensive heart disease, including but not limited to, systemic (left-sided) hypertensive heart disease and pulmonary (right-sided) hypertensive heart disease; valvular heart disease, including but not limited to, valvular degeneration caused by calcification, such as calcific aortic stenosis, calcification of a congenitally bicuspid aortic valve, and mitral annular calcification, and myxomatous degeneration of the mitral valve (mitral valve prolapse), rheumatic fever and rheumatic heart disease, infective endocarditis, and noninfected vegetations, such as nonbacterial thrombotic endocarditis and endocarditis of systemic lupus erythematosus (Libman-Sacks disease), carcinoid heart disease, and complications of artificial valves; myocardial disease, including but not limited to dilated cardiomyopathy, hypertrophic cardiomyopathy, restrictive cardiomyopathy, and myocarditis; pericardial disease, including but not limited to, pericardial effusion and hemopericardium and pericarditis, including acute pericarditis and healed pericarditis, and rheumatoid heart disease; neoplastic heart disease, including but not limited to, primary cardiac tumors, such as myxoma, lipoma, papillary fibroelastoma, rhabdomyoma, and sarcoma, and cardiac effects of noncardiac neoplasms; congenital heart disease, including but not limited to, left-to-right shunts (late cyanosis), such as atrial septal defect, ventricular septal defect, patent ductus arteriosus, and atrioventricular septal defect, right-to-left shunts (early cyanosis), such as tetralogy of Fallot, transposition of great arteries, truncus arteriosus, tricuspid atresia, and total anomalous pulmonary venous connection, obstructive congenital anomalies, such as coarctation of aorta, pulmonary stenosis and atresia, and aortic stenosis and atresia, and disorders involving cardiac transplantation.

Disorders involving the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), Bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

Disorders of the spleen include, but are not limited to, splenomegaly, including nonspecific acute splenitis, congestive splenomegaly, and splenic infarcts; neoplasms, congenital anomalies, and rupture. Disorders associated with splenomegaly include infections, such as nonspecific splenitis, infectious mononucleosis, tuberculosis, typhoid fever, brucellosis, cytomegalovirus, syphilis, malaria, histoplasmosis, toxoplasmosis, kala-azar, trypanosomiasis, schistosomiasis, leishmaniasis, and echinococcosis; congestive states related to portal hypertension, such as cirrhosis of the liver, portal or splenic vein thrombosis, and cardiac failure; lymphohematogenous disorders, such as Hodgkin disease, non-Hodgkin lymphomas/leukemia, multiple myeloma, myeloproliferative disorders, hemolytic anemias, and thrombocytopenic purpura; immunologic-inflammatory conditions, such as rheumatoid arthritis and systemic lupus erythematosus; storage diseases such as Gaucher disease, Niemann-Pick disease, and mucopolysaccharidoses; and other conditions, such as amyloidosis, primary neoplasms and cysts, and secondary neoplasms.

Disorders involving the thymus include developmental disorders, such as DiGeorge syndrome with thymic hypoplasia or aplasia; thymic cysts; thymic hyperplasia, which involves the appearance of lymphoid follicles within the thymus, creating thymic follicular hyperplasia; and thymomas, including germ cell tumors, lymphomas, Hodgkin disease, and carcinoids. Thymomas can include benign or encapsulated thymoma, and malignant thymoma Type I (invasive thymoma) or Type II, designated thymic carcinoma.

Disorders involving the skeletal muscle include tumors such as rhabdomyosarcoma.

Disorders involving T-cells include, but are not limited to, cell-mediated hypersensitivity, such as delayed type hypersensitivity and T-cell-mediated cytotoxicity, and transplant rejection; autoimmune diseases, such as systemic lupus erythematosus, Sjögren syndrome, systemic sclerosis, inflammatory myopathies, mixed connective tissue disease, and polyarteritis nodosa and other vasculitides; immunologic deficiency syndromes, including but not limited to, primary immunodeficiencies, such as thymic hypoplasia, severe combined immunodeficiency diseases, and AIDS; leukopenia; reactive (inflammatory) proliferations of white cells, including but not limited to, leukocytosis, acute nonspecific lymphadenitis, and chronic nonspecific lymphadenitis; neoplastic proliferations of white cells, including but not limited to lymphoid neoplasms, such as precursor T-cell neoplasms, such as acute lymphoblastic leukemia/lymphoma, peripheral T-cell and natural killer cell neoplasms that include peripheral T-cell lymphoma, unspecified, adult T-cell leukemia/lymphoma, mycosis fungoides and Sezary syndrome, and Hodgkin disease.

In normal bone marrow, the myelocytic series (polymorphonuclear cells) make up approximately 60% of the cellular elements, and the erythrocytic series, 20-30%. Lymphocytes, monocytes, reticular cells, plasma cells and megakaryocytes together constitute 10-20%. Lymphocytes make up 5-15% of normal adult marrow. In the bone marrow, cell types are admixed so that precursors of red blood cells (erythroblasts), macrophages (monoblasts), platelets (megakaryocytes), polymorphonuclear leucocytes (myeloblasts), and lymphocytes (lymphoblasts) can be visible in one microscopic field. In addition, stem cells exist for the different cell lineages, as well as a precursor stem cell for the committed progenitor cells of the different lineages. The various types of cells and stages of each would be known to the person of ordinary skill in the art and are found, for example, on page 42 (FIG. 2-8) of Immunology, Immunopathology and Immunity, Fifth Edition, Sell et al. Simon and Schuster (1996), incorporated by reference for its teaching of cell types found in the bone marrow. Accordingly, the invention is directed to disorders arising from these cells. These disorders include but are not limited to the following: diseases involving hematopoietic stem cells; committed lymphoid progenitor cells; lymphoid cells including B and T-cells; committed myeloid progenitors, including monocytes, granulocytes, and megakaryocytes; and committed erythroid progenitors. These include but are not limited to the leukemias, including B-lymphoid leukemias, T-lymphoid leukemias, undifferentiated leukemias; erythroleukemia, megakaryoblastic leukemia, monocytic; [leukemias are encompassed with and without differentiation]; chronic and acute lymphoblastic leukemia, chronic and acute lymphocytic leukemia, chronic and acute myelogenous leukemia, lymphoma, myelodysplastic syndrome, chronic and acute myeloid leukemia, myelomonocytic leukemia; chronic and acute myeloblastic leukemia, chronic and acute myelogenous leukemia, chronic and acute promyelocytic leukemia, chronic and acute myelocytic leukemia, hematologic malignancies of monocyte-macrophage lineage, such as juvenile chronic myelogenous leukemia; secondary AML, antecedent hematological disorder; refractory anemia; aplastic anemia; reactive cutaneous angioendotheliomatosis; fibrosing disorders involving altered expression in dendritic cells, disorders including systemic sclerosis, E-M syndrome, epidemic toxic oil syndrome, eosinophilic fasciitis, localized forms of scleroderma, keloid, and fibrosing colonopathy; angiomatoid malignant fibrous histiocytoma; carcinoma, including primary head and neck squamous cell carcinoma; sarcoma, including Kaposi's sarcoma; fibroadenoma and phyllodes tumors, including mammary fibroadenoma; stromal tumors; phyllodes tumors, including histiocytoma; erythroblastosis; neurofibromatosis; diseases of the vascular endothelium; demyelinating, particularly in old lesions; gliosis, vasogenic edema, vascular disease, Alzheimer's and Parkinson's disease; T-cell lymphomas; B-cell lymphomas.

Disorders involving red cells include, but are not limited to, anemias, such as hemolytic anemias, including hereditary spherocytosis, hemolytic disease due to erythrocyte enzyme defects: glucose-6-phosphate dehydrogenase deficiency, sickle cell disease, thalassemia syndromes, paroxysmal nocturnal hemoglobinuria, immunohemolytic anemia, and hemolytic anemia resulting from trauma to red cells; and anemias of diminished erythropoiesis, including megaloblastic anemias, such as anemias of vitamin B12 deficiency: pernicious anemia, and anemia of folate deficiency, iron deficiency anemia, anemia of chronic disease, aplastic anemia, pure red cell aplasia, and other forms of marrow failure.

Disorders involving B-cells include, but are not limited to precursor B-cell neoplasms, such as lymphoblastic leukemia/lymphoma. Peripheral B-cell neoplasms include, but are not limited to, chronic lymphocytic leukemia/small lymphocytic lymphoma, follicular lymphoma, diffuse large B-cell lymphoma, Burkitt lymphoma, plasma cell neoplasms, multiple myeloma, and related entities, lymphoplasmacytic lymphoma (Waldenström macroglobulinemia), mantle cell lymphoma, marginal zone lymphoma (MALToma), and hairy cell leukemia.

Disorders related to reduced platelet number, thrombocytopenia, include idiopathic thrombocytopenic purpura, including acute idiopathic thrombocytopenic purpura, drug-induced thrombocytopenia, HIV-associated thrombocytopenia, and thrombotic microangiopathies: thrombotic thrombocytopenic purpura and hemolytic-uremic syndrome.

Disorders involving precursor T-cell neoplasms include precursor T lymphoblastic leukemia/lymphoma. Disorders involving peripheral T-cell and natural killer cell neoplasms include T-cell chronic lymphocytic leukemia, large granular lymphocytic leukemia, mycosis fungoides and Sezary syndrome, peripheral T-cell lymphoma, unspecified, angioimmunoblastic T-cell lymphoma, angiocentric lymphoma (NK/T-cell lymphoma), intestinal T-cell lymphoma, adult T-cell leukemia/lymphoma, and anaplastic large cell lymphoma.

Specifically, the present invention provides isolated nucleic acid molecules comprising nucleotide sequences encoding the adenylate kinase polypeptide whose amino acid sequence is given in SEQ ID NO:2, or a variant or fragment of the polypeptide. A nucleotide sequence encoding an adenylate kinase polypeptide of the invention, more particularly the polypeptide of SEQ ID NO:2, is set forth in SEQ ID NO:1.

A novel human gene, termed clone h23552, is provided. This sequence, and complements thereof, are referred to as “adenylate kinase” indicating that the gene sequences share sequence similarity to adenylate kinase genes.

The novel h23552 adenylate kinase gene encodes an approximately 1.43 kb mRNA transcript having the corresponding cDNA set forth in SEQ ID NO:1. This transcript has a 684 nucleotide open reading frame (nucleotides 200-883 of SEQ ID NO:1), which encodes a 228 amino acid protein (SEQ ID NO:2). An analysis of the full-length h23552 polypeptide predicts that the N-terminal 47 amino acids may represent a region comprising a signal peptide. Prosite program analysis was used to predict various sites within the h23552 protein. An N-glycosylation site was predicted at aa 137-140. Protein kinase C phosphorylation sites were predicted at aa 21-23, 29-31, 170-172, and 190-192. Casein kinase II phosphorylation sites were predicted at aa 65-68 and 212-215. Tyrosine kinase phosphorylation sites were predicted at aa 54-61 and 74-81. N-myristoylation sites were predicted at aa 42-47 and 49-54. An adenylate kinase signature sequence was predicted at aa 121-132.
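
The relationship between the stated open reading frame coordinates (nucleotides 200-883) and the stated protein length (228 amino acids) can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming 1-based inclusive coordinates and assuming that the stated range does not include a stop codon; the function names are hypothetical and are not part of the disclosure.

```python
def region_length(start: int, end: int) -> int:
    """Length in nucleotides of a region given 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("coordinates must satisfy 1 <= start <= end")
    return end - start + 1


def codon_count(start: int, end: int) -> int:
    """Number of complete codons in the region; the length must be a multiple of 3."""
    length = region_length(start, end)
    if length % 3 != 0:
        raise ValueError(f"region of {length} nt is not a whole number of codons")
    return length // 3


# Coordinates taken from the text; the stop-codon assumption is ours.
orf_nt = region_length(200, 883)
orf_codons = codon_count(200, 883)
print(orf_nt, orf_codons)
```

Under these assumptions the region spans 684 nucleotides, which corresponds to 228 codons.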

The h23552 adenylate kinase protein possesses an adenylate kinase domain sequence, from aa 40-203, as predicted by HMMer, Version 2. This region of the protein comprises the three functional subdomains common to nucleoside monophosphate kinases: the nucleoside triphosphate binding glycine-rich region, the nucleoside monophosphate binding site, and the lid domain that closes over the substrate upon binding (see Schulz (1987) Cold Spring Harbor Symp. Quant. Biol. 52:429-439).

The h23552 protein displays closest similarity to the porcine UMP-CMP kinase (SP Accession Number Q29561; SEQ ID NO:3), approximately 97.4% identity when aa 33-228 are aligned over the full-length sequence for the porcine kinase (see FIG. 1). The N-terminal region of the h23552 protein (aa 1-32) is novel. Alignment of the h23552 protein with the porcine UMP-CMP kinase indicates that the glycine-rich region corresponding to the binding site of the nucleoside triphosphate donor resides at approximately amino acid residues 42-50 of the h23552 protein. The region corresponding to the nucleoside monophosphate binding site resides at approximately amino acid residues 65-95; and the region corresponding to the lid domain resides at approximately amino acid residues 166-175. The similarity of the novel h23552 protein to the porcine UMP-CMP kinase indicates the h23552 adenylate kinase is a member of the subclass of nucleoside monophosphate kinases referred to as “short enzymes”. Members of this subclass, which are characterized by their short-length lid domain, include adenylate kinase 1 (AK1, identified in rabbit, bovine, human, pig, and chicken), adenylate kinase 5 (AK5, identified in human), and UMP-CMP kinases (identified in porcine, Dictyostelium discoideum, Saccharomyces cerevisiae). See Rompay et al. (1999) Eur. J. Biochem. 261:509-516 and Fukami-Kobayashi et al. (1996) FEBS Lett. 385:214-220.

A plasmid containing the h23552 cDNA insert was deposited with the Patent Depository of the American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va., on May 29, 2000, and assigned Patent Deposit Number PTA-1850. This deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms for the Purposes of Patent Procedure. This deposit was made merely as a convenience for those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. §112.

The adenylate kinase sequences of the invention are members of a family of molecules having conserved functional features. The term “family” when referring to the proteins and nucleic acid molecules of the invention is intended to mean two or more proteins or nucleic acid molecules having sufficient amino acid or nucleotide sequence identity as defined herein. Such family members can be naturally occurring and can be from either the same or different species. For example, a family can contain a first protein of murine origin and a homologue of that protein of human origin, as well as a second, distinct protein of human origin and a murine homologue of that protein. Members of a family may also have common functional characteristics.

Preferred adenylate kinase polypeptides of the present invention have an amino acid sequence sufficiently identical to the amino acid sequence of SEQ ID NO:2. The term “sufficiently identical” is used herein to refer to a first amino acid or nucleotide sequence that contains a sufficient or minimum number of identical or equivalent (e.g., with a similar side chain) amino acid residues or nucleotides to a second amino acid or nucleotide sequence such that the first and second amino acid or nucleotide sequences have a common structural domain and/or common functional activity. For example, amino acid or nucleotide sequences that contain a common structural domain having at least about 45%, 55%, or 65% identity, preferably 75% identity, more preferably 85%, 95%, or 98% identity are defined herein as sufficiently identical.

To determine the percent identity of two amino acid sequences or of two nucleic acids, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in the sequence of a first amino acid or nucleic acid sequence for optimal alignment with a second amino or nucleic acid sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position. The percent identity between the two sequences is a function of the number of identical positions shared by the sequences (i.e., % identity = # of identical positions/total # of positions (e.g., overlapping positions) × 100). In one embodiment, the two sequences are the same length. The percent identity between two sequences can be determined using techniques similar to those described below, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.
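
The percent identity calculation described above can be sketched in a few lines. This is a minimal illustration only, assuming the two sequences have already been aligned to equal length and that gaps are written as "-"; the function name `percent_identity` and the `count_gaps` option are hypothetical, and the short sequences are invented examples rather than portions of any sequence of the invention.

```python
def percent_identity(aligned_a: str, aligned_b: str, count_gaps: bool = True) -> float:
    """Percent identity = identical positions / total compared positions x 100.

    aligned_a and aligned_b must be pre-aligned strings of equal length,
    with '-' marking a gap. If count_gaps is False, columns containing a gap
    are left out of the total (identity "without allowing gaps").
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    identical = 0
    total = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        has_gap = x == "-" or y == "-"
        if x == "-" and y == "-":
            continue  # a column that is a gap in both sequences carries no information
        if has_gap and not count_gaps:
            continue
        total += 1
        if not has_gap and x == y:
            identical += 1  # only exact matches are counted
    if total == 0:
        raise ValueError("no comparable positions")
    return 100.0 * identical / total


print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-GT", "ACAGT"))
print(percent_identity("AC-GT", "ACAGT", count_gaps=False))
```

Whether gapped columns count toward the total changes the result, which is why the choice is exposed as a parameter in this sketch.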

The determination of percent identity between two sequences can be accomplished using a mathematical algorithm. A preferred, non-limiting example of a mathematical algorithm utilized for the comparison of two sequences is the algorithm of Karlin and Altschul (1990) Proc. Natl. Acad. Sci. USA 87:2264-2268, modified as in Karlin and Altschul (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs of Altschul, et al. (1990) J. Mol. Biol. 215:403-410. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to adenylate kinase nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to adenylate kinase protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402. Alternatively, PSI-Blast can be used to perform an iterated search which detects distant relationships between molecules (Id.). When utilizing BLAST, Gapped BLAST, and PSI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torelli and Robotti (1994) Comput. Appl. Biosci., 10:3-5; and FASTA described in Pearson and Lipman (1988) Proc. Natl. Acad. Sci. 85:2444-8. Within FASTA, ktup is a control option that sets the sensitivity and speed of the search. If ktup=2, similar regions in the two sequences being compared are found by looking at pairs of aligned residues; if ktup=1, single aligned amino acids are examined. ktup can be set to 2 or 1 for protein sequences, or from 1 to 6 for DNA sequences. The default if ktup is not specified is 2 for proteins and 6 for DNA. For a further description of FASTA parameters, see bioweb.pasteur.fr/docs/man/man/fasta.1.html#sect2, the contents of which are incorporated herein by reference.
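
As a rough illustration of how the ktup setting trades sensitivity for speed, the sketch below lists the positions at which two sequences share an exact run of ktup residues. It is not the FASTA implementation and omits scoring, diagonal joining, and alignment; the function name `shared_ktuples` and the two short peptide strings are hypothetical.

```python
from collections import defaultdict


def shared_ktuples(query: str, subject: str, ktup: int) -> list[tuple[int, int]]:
    """Return (query_pos, subject_pos) pairs (0-based) where both sequences
    contain the same run of `ktup` consecutive residues."""
    if ktup < 1:
        raise ValueError("ktup must be at least 1")
    # Index every k-tuple of the query by its starting position.
    index = defaultdict(list)
    for i in range(len(query) - ktup + 1):
        index[query[i:i + ktup]].append(i)
    # Scan the subject and look up each of its k-tuples in the index.
    hits = []
    for j in range(len(subject) - ktup + 1):
        for i in index.get(subject[j:j + ktup], ()):
            hits.append((i, j))
    return hits


# Invented example peptides, not taken from any SEQ ID NO.
print(shared_ktuples("MKVLA", "AKVLG", 2))
print(shared_ktuples("MKVLA", "AKVLG", 1))
```

With the smaller ktup, more candidate positions are reported, including isolated single-residue matches, which reflects the greater sensitivity and greater work described for ktup=1.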

The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.

Accordingly, another embodiment of the invention features isolated adenylate kinase proteins and polypeptides having an adenylate kinase protein activity. As used interchangeably herein, an “adenylate kinase protein activity”, “biological activity of an adenylate kinase protein”, or “functional activity of an adenylate kinase protein” refers to an activity exerted by an adenylate kinase protein, polypeptide, or nucleic acid molecule on an adenylate kinase responsive cell as determined in vivo, or in vitro, according to standard assay techniques. An adenylate kinase activity can be a direct activity, such as an association with or an enzymatic activity on a second protein, or an indirect activity, such as a cellular activity mediated by interaction of the adenylate kinase protein with a second protein. In a preferred embodiment, an adenylate kinase activity includes at least one or more of the following activities: (1) modulating (stimulating and/or enhancing or inhibiting) cellular proliferation, differentiation, and/or function, particularly in cells in which the sequences are expressed (for example, cells of the lymph node, spleen, thymus, brain, lung, skeletal muscle, fetal liver, tonsil, colon, heart, liver, and immune cells, including Th1, Th2, T cells, natural killer T cells, lymphocytes, leukocytes, bone marrow, etc.); (2) modulating a target cell's energy balance, particularly the ratio between AMP and ATP; (3) modulating the glycolytic pathway; (4) modulating the gluconeogenesis pathway; (5) modulating cell growth; (6) modulating the entry of cells into mitosis; (7) modulating cellular differentiation; (8) modulating cell death; and (9) modulating an immune response.

An “isolated” or “purified” adenylate kinase nucleic acid molecule or protein, or biologically active portion thereof, is substantially free of other cellular material, or culture medium when produced by recombinant techniques, or substantially free of chemical precursors or other chemicals when chemically synthesized. Preferably, an “isolated” nucleic acid is free of sequences (preferably protein encoding sequences) that naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. For purposes of the invention, “isolated” when used to refer to nucleic acid molecules excludes isolated chromosomes. For example, in various embodiments, the isolated adenylate kinase nucleic acid molecule can contain less than about 5 kb, 4 kb, 3 kb, 2 kb, 1 kb, 0.5 kb, or 0.1 kb of nucleotide sequences that naturally flank the nucleic acid molecule in genomic DNA of the cell from which the nucleic acid is derived. An adenylate kinase protein that is substantially free of cellular material includes preparations of adenylate kinase protein having less than about 30%, 20%, 10%, or 5% (by dry weight) of non-adenylate kinase protein (also referred to herein as a “contaminating protein”). When the adenylate kinase protein or biologically active portion thereof is recombinantly produced, preferably, culture medium represents less than about 30%, 20%, 10%, or 5% of the volume of the protein preparation. When adenylate kinase protein is produced by chemical synthesis, preferably the protein preparations have less than about 30%, 20%, 10%, or 5% (by dry weight) of chemical precursors or non-adenylate kinase chemicals.

Various aspects of the invention are described in further detail in the following subsections.

I. Isolated Nucleic Acid Molecules

One aspect of the invention pertains to isolated nucleic acid molecules comprising nucleotide sequences encoding adenylate kinase proteins and polypeptides or biologically active portions thereof, as well as nucleic acid molecules sufficient for use as hybridization probes to identify adenylate kinase-encoding nucleic acids (e.g., adenylate kinase mRNA) and fragments for use as PCR primers for the amplification or mutation of adenylate kinase nucleic acid molecules. As used herein, the term “nucleic acid molecule” is intended to include DNA molecules (e.g., cDNA or genomic DNA) and RNA molecules (e.g., mRNA) and analogs of the DNA or RNA generated using nucleotide analogs. The nucleic acid molecule can be single-stranded or double-stranded, but preferably is double-stranded DNA.

Nucleotide sequences encoding the adenylate kinase proteins of the present invention include sequences set forth in SEQ ID NO:1, the nucleotide sequence of the cDNA insert of the plasmid deposited with the ATCC as Patent Deposit Number PTA-1850 (the “cDNA of Patent Deposit Number PTA-1850”), and complements thereof. By “complement” is intended a nucleotide sequence that is sufficiently complementary to a given nucleotide sequence such that it can hybridize to the given nucleotide sequence to thereby form a stable duplex. The corresponding amino acid sequence for the adenylate kinase protein encoded by these nucleotide sequences is set forth in SEQ ID NO:2.
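
The simplest case of a complement in the sense used above is the fully complementary strand of a DNA sequence. The sketch below computes it under standard Watson-Crick pairing; it covers only perfect complementarity, whereas the definition above also admits sequences that are merely sufficiently complementary to form a stable duplex. The function name `reverse_complement` and the example sequence are hypothetical.

```python
# Translation table for Watson-Crick pairing of DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the perfectly complementary strand, written 5' to 3'."""
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return dna.translate(_COMPLEMENT)[::-1]


# Invented example sequence, not taken from SEQ ID NO:1.
print(reverse_complement("ATGC"))
```

Applying the function twice returns the starting sequence, which is a convenient sanity check when preparing probe sequences.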

Nucleic acid molecules that are fragments of these adenylate kinase nucleotide sequences are also encompassed by the present invention. By “fragment” is intended a portion of the nucleotide sequence encoding an adenylate kinase protein. A fragment of an adenylate kinase nucleotide sequence may encode a biologically active portion of an adenylate kinase protein, or it may be a fragment that can be used as a hybridization probe or PCR primer using methods disclosed below. A biologically active portion of an adenylate kinase protein can be prepared by isolating a portion of one of the adenylate kinase nucleotide sequences of the invention, expressing the encoded portion of the adenylate kinase protein (e.g., by recombinant expression in vitro), and assessing the activity of the encoded portion of the adenylate kinase protein. Nucleic acid molecules that are fragments of an adenylate kinase nucleotide sequence comprise at least 15, 20, 50, 75, 100, 200, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350, 1400 nucleotides, or up to the number of nucleotides present in a full-length adenylate kinase nucleotide sequence disclosed herein (for example, 1434 nucleotides for SEQ ID NO:1) depending upon the intended use.

It is understood that isolated fragments include any contiguous sequence not disclosed prior to the invention as well as sequences that are substantially the same and which are not disclosed. Accordingly, if an isolated fragment is disclosed prior to the present invention, that fragment is not intended to be encompassed by the invention. When a sequence is not disclosed prior to the present invention, an isolated nucleic acid fragment is at least about 12, 15, 20, 25, or 30 contiguous nucleotides. Other regions of the nucleotide sequence may comprise fragments of various sizes, depending upon potential homology with previously disclosed sequences.

For example, when considering the full-length, 1434 nucleotide transcript set forth in SEQ ID NO:1, the nucleotide sequence from about nucleotide (nt) 1 to about nt 200 encompasses isolated fragments greater than about 13, 15, or 20 nucleotides; the nucleotide sequence from about nt 200 to about nt 1034 encompasses isolated fragments greater than about 102, 105, or 110 nucleotides; the nucleotide sequence from about nt 1034 to about nt 1434 encompasses isolated fragments greater than about 24, 25, or 28 nucleotides. The nucleotide sequence corresponding to the open reading frame (nt 200-883 of SEQ ID NO:1) encompasses isolated fragments greater than about 102, 105, or 110 nucleotides.

A fragment of an adenylate kinase nucleotide sequence that encodes a biologically active portion of an adenylate kinase protein of the invention will encode at least 15, 25, 30, 50, 75, 100, 125, 150, 175, 200, or 225 contiguous amino acids, or up to the total number of amino acids present in a full-length adenylate kinase protein of the invention (for example, 228 amino acids for SEQ ID NO:2). Fragments of an adenylate kinase nucleotide sequence that are useful as hybridization probes or PCR primers generally need not encode a biologically active portion of an adenylate kinase protein.

Nucleic acid molecules that are variants of the adenylate kinase nucleotide sequences disclosed herein are also encompassed by the present invention. “Variants” of the adenylate kinase nucleotide sequences include those sequences that encode the adenylate kinase proteins disclosed herein but that differ conservatively because of the degeneracy of the genetic code. These naturally occurring allelic variants can be identified with the use of well-known molecular biology techniques, such as polymerase chain reaction (PCR) and hybridization techniques as outlined below. Variant nucleotide sequences also include synthetically derived nucleotide sequences that have been generated, for example, by using site-directed mutagenesis but which still encode the adenylate kinase proteins disclosed in the present invention as discussed below. Generally, nucleotide sequence variants of the invention will have at least 45%, 55%, 65%, 75%, 85%, 95%, or 98% identity to a particular nucleotide sequence disclosed herein. A variant adenylate kinase nucleotide sequence will encode an adenylate kinase protein that has an amino acid sequence having at least 45%, 55%, 65%, 75%, 85%, 95%, or 98% identity to the amino acid sequence of an adenylate kinase protein disclosed herein.
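
The effect of the degeneracy of the genetic code on nucleotide variants can be illustrated with a small translation routine. The sketch below uses the standard genetic code and two invented nine-nucleotide sequences (not portions of SEQ ID NO:1) that differ at two positions yet encode the same peptide; the names `CODON_TABLE` and `translate` are hypothetical.

```python
# Standard genetic code, laid out in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a coding DNA sequence, read in frame from its first base,
    into one-letter amino acid codes ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two invented sequences that differ in the third base of two codons.
reference = "ATGGCTAAA"
variant = "ATGGCCAAG"
print(translate(reference), translate(variant))
```

Both sequences translate to the same three-residue peptide even though their nucleotide sequences are not identical, which is the sense in which such variants differ conservatively.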

In addition to the adenylate kinase nucleotide sequences shown in SEQ ID NOs:1 and 3, and the nucleotide sequence of the cDNA of Patent Deposit Number PTA-1850, it will be appreciated by those skilled in the art that DNA sequence polymorphisms that lead to changes in the amino acid sequences of adenylate kinase proteins may exist within a population (e.g., the human population). Such genetic polymorphism in an adenylate kinase gene may exist among individuals within a population due to natural allelic variation. An allele is one of a group of genes that occur alternatively at a given genetic locus. As used herein, the terms “gene” and “recombinant gene” refer to nucleic acid molecules comprising an open reading frame encoding an adenylate kinase protein, preferably a mammalian adenylate kinase protein. As used herein, the phrase “allelic variant” refers to a nucleotide sequence that occurs at an adenylate kinase locus or to a polypeptide encoded by the nucleotide sequence. Such natural allelic variations can typically result in 1-5% variance in the nucleotide sequence of the adenylate kinase gene. Any and all such nucleotide variations and resulting amino acid polymorphisms or variations in an adenylate kinase sequence that are the result of natural allelic variation and that do not alter the functional activity of adenylate kinase proteins are intended to be within the scope of the invention.

Moreover, nucleic acid molecules encoding adenylate kinase proteins from other species (adenylate kinase homologues), which have a nucleotide sequence differing from that of the adenylate kinase sequences disclosed herein, are intended to be within the scope of the invention. For example, nucleic acid molecules corresponding to natural allelic variants and homologues of the human adenylate kinase cDNA of the invention can be isolated based on their identity to the human adenylate kinase nucleic acid disclosed herein using the human cDNA, or a portion thereof, as a hybridization probe according to standard hybridization techniques under stringent hybridization conditions as disclosed below.

In addition to naturally-occurring allelic variants of the adenylate kinase sequences that may exist in the population, the skilled artisan will further appreciate that changes can be introduced by mutation into the nucleotide sequences of the invention thereby leading to changes in the amino acid sequence of the encoded adenylate kinase proteins, without altering the biological activity of the adenylate kinase proteins. Thus, an isolated nucleic acid molecule encoding an adenylate kinase protein having a sequence that differs from that of SEQ ID NO:2 can be created by introducing one or more nucleotide substitutions, additions, or deletions into the corresponding nucleotide sequence disclosed herein, such that one or more amino acid substitutions, additions or deletions are introduced into the encoded protein. Mutations can be introduced by standard techniques, such as site-directed mutagenesis and PCR-mediated mutagenesis. Such variant nucleotide sequences are also encompassed by the present invention.

For example, preferably, conservative amino acid substitutions may be made at one or more predicted, preferably nonessential amino acid residues. A “nonessential” amino acid residue is a residue that can be altered from the wild-type sequence of an adenylate kinase protein (e.g., the sequence of SEQ ID NO:2) without altering the biological activity, whereas an “essential” amino acid residue is required for biological activity. A “conservative amino acid substitution” is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art. These families include amino acids with basic side chains (e.g., lysine, arginine, histidine), acidic side chains (e.g., aspartic acid, glutamic acid), uncharged polar side chains (e.g., glycine, asparagine, glutamine, serine, threonine, tyrosine, cysteine), nonpolar side chains (e.g., alanine, valine, leucine, isoleucine, proline, phenylalanine, methionine, tryptophan), beta-branched side chains (e.g., threonine, valine, isoleucine) and aromatic side chains (e.g., tyrosine, phenylalanine, tryptophan, histidine). Such substitutions would not be made for conserved amino acid residues, or for amino acid residues residing within a conserved motif, such as the adenylate kinase domain sequence of SEQ ID NO:2 (amino acid residues 40-203), where such residues are essential for protein activity.
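
The side chain families listed above can be expressed as a simple lookup for classifying a proposed substitution. This is a minimal sketch, assuming one-letter amino acid codes and treating a substitution as conservative when the two residues share at least one of the listed families; the names `SIDE_CHAIN_FAMILIES`, `shared_families`, and `is_conservative_substitution` are hypothetical, and the check does not consider whether a residue lies within a conserved motif.

```python
# Families as listed in the text, written with one-letter amino acid codes.
SIDE_CHAIN_FAMILIES = {
    "basic": set("KRH"),             # lysine, arginine, histidine
    "acidic": set("DE"),             # aspartic acid, glutamic acid
    "uncharged polar": set("GNQSTYC"),
    "nonpolar": set("AVLIPFMW"),
    "beta-branched": set("TVI"),
    "aromatic": set("YFWH"),
}


def shared_families(residue_a: str, residue_b: str) -> list[str]:
    """Names of the families that contain both residues, in sorted order."""
    a, b = residue_a.upper(), residue_b.upper()
    return sorted(
        name
        for name, members in SIDE_CHAIN_FAMILIES.items()
        if a in members and b in members
    )


def is_conservative_substitution(original: str, replacement: str) -> bool:
    """True if the replacement differs from the original residue and shares
    at least one side chain family with it."""
    if original.upper() == replacement.upper():
        return False  # identical residues are not a substitution
    return bool(shared_families(original, replacement))


print(is_conservative_substitution("K", "R"))  # lysine to arginine
print(is_conservative_substitution("D", "K"))  # aspartic acid to lysine
print(shared_families("T", "V"))               # threonine and valine
```

Because some residues belong to more than one family, a pair such as threonine and valine can be conservative through one family (beta-branched) even though their polarity classes differ.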

Alternatively, variant adenylate kinase nucleotide sequences can be made by introducing mutations randomly along all or part of an adenylate kinase coding sequence, such as by saturation mutagenesis, and the resultant mutants can be screened for adenylate kinase biological activity to identify mutants that retain activity. Following mutagenesis, the encoded protein can be expressed recombinantly, and the activity of the protein can be determined using standard assay techniques.

Thus the nucleotide sequences of the invention include the sequences disclosed herein as well as fragments and variants thereof. The adenylate kinase nucleotide sequences of the invention, and fragments and variants thereof, can be used as probes and/or primers to identify and/or clone adenylate kinase homologues in other cell types, e.g., from other tissues, as well as adenylate kinase homologues from other mammals. Such probes can be used to detect transcripts or genomic sequences encoding the same or identical proteins. These probes can be used as part of a diagnostic test kit for identifying cells or tissues that misexpress an adenylate kinase protein, such as by measuring levels of an adenylate kinase-encoding nucleic acid in a sample of cells from a subject, e.g., detecting adenylate kinase mRNA levels or determining whether a genomic adenylate kinase gene has been mutated or deleted.

In this manner, methods such as PCR, hybridization, and the like can be used to identify such sequences having substantial identity to the sequences of the invention. See, for example, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.) and Innis, et al. (1990) PCR Protocols: A Guide to Methods and Applications (Academic Press, NY). Adenylate kinase nucleotide sequences isolated based on their sequence identity to the adenylate kinase nucleotide sequences set forth herein or to fragments and variants thereof are encompassed by the present invention.

In a hybridization method, all or part of a known adenylate kinase nucleotide sequence can be used to screen cDNA or genomic libraries. Methods for construction of such cDNA and genomic libraries are generally known in the art and are disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.). The so-called hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as ³²P, or any other detectable marker, such as other radioisotopes, a fluorescent compound, an enzyme, or an enzyme co-factor. Probes for hybridization can be made by labeling synthetic oligonucleotides based on the known adenylate kinase nucleotide sequence disclosed herein. Degenerate primers designed on the basis of conserved nucleotides or amino acid residues in a known adenylate kinase nucleotide sequence or encoded amino acid sequence can additionally be used. The probe typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 12, preferably about 25, more preferably about 50, 75, 100, 125, 150, 175, 200, 250, 300, 350, or 400 consecutive nucleotides of an adenylate kinase nucleotide sequence of the invention or a fragment or variant thereof. Preparation of probes for hybridization is generally known in the art and is disclosed in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.), herein incorporated by reference.

For example, in one embodiment, a previously unidentified adenylate kinase nucleic acid molecule hybridizes under stringent conditions to a probe that is a nucleic acid molecule comprising one of the adenylate kinase nucleotide sequences of the invention or a fragment thereof. In another embodiment, the previously unknown adenylate kinase nucleic acid molecule is at least 300, 325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 2,000, 3,000, 4,000 or 5,000 nucleotides in length and hybridizes under stringent conditions to a probe that is a nucleic acid molecule comprising one of the adenylate kinase nucleotide sequences disclosed herein or a fragment thereof.

Accordingly, in another embodiment, an isolated previously unknown adenylate kinase nucleic acid molecule of the invention is at least 300, 325, 350, 375, 400, 425, 450, 500, 550, 600, 650, 700, 800, 900, 1000, 1,100, 1,200, 1,300, or 1,400 nucleotides in length and hybridizes under stringent conditions to a probe that is a nucleic acid molecule comprising one of the nucleotide sequences of the invention, preferably the coding sequence set forth in SEQ ID NO:1, the cDNA of Patent Deposit Number PTA-1850, or a complement, fragment, or variant thereof.

As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences having at least 60%, 65%, 70%, preferably 75% identity to each other typically remain hybridized to each other. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology (John Wiley & Sons, New York (1989)), 6.3.1-6.3.6. A preferred, non-limiting example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. In another preferred embodiment, stringent conditions comprise hybridization in 6×SSC at 42° C., followed by washing with 1×SSC at 55° C. Preferably, an isolated nucleic acid molecule that hybridizes under stringent conditions to an adenylate kinase sequence of the invention corresponds to a naturally-occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

Thus, in addition to the adenylate kinase nucleotide sequences disclosed herein and fragments and variants thereof, the isolated nucleic acid molecules of the invention also encompass homologous DNA sequences identified and isolated from other cells and/or organisms by hybridization with entire or partial sequences obtained from the adenylate kinase nucleotide sequences disclosed herein or variants and fragments thereof.

The present invention also encompasses antisense nucleic acid molecules, i.e., molecules that are complementary to a sense nucleic acid encoding a protein, e.g., complementary to the coding strand of a double-stranded cDNA molecule, or complementary to an mRNA sequence. Accordingly, an antisense nucleic acid can hydrogen bond to a sense nucleic acid. The antisense nucleic acid can be complementary to an entire adenylate kinase coding strand, or to only a portion thereof, e.g., all or part of the protein coding region (or open reading frame). An antisense nucleic acid molecule can be antisense to a noncoding region of the coding strand of a nucleotide sequence encoding an adenylate kinase protein. The noncoding regions are the 5′ and 3′ sequences that flank the coding region and are not translated into amino acids.

Given the coding-strand sequence encoding an adenylate kinase protein disclosed herein (e.g., SEQ ID NO:1), antisense nucleic acids of the invention can be designed according to the rules of Watson and Crick base pairing. The antisense nucleic acid molecule can be complementary to the entire coding region of adenylate kinase mRNA, but more preferably is an oligonucleotide that is antisense to only a portion of the coding or noncoding region of adenylate kinase mRNA. For example, the antisense oligonucleotide can be complementary to the region surrounding the translation start site of adenylate kinase mRNA. An antisense oligonucleotide can be, for example, about 5, 10, 15, 20, 25, 30, 35, 40, 45, or 50 nucleotides in length. An antisense nucleic acid of the invention can be constructed using chemical synthesis and enzymatic ligation procedures known in the art.
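As a non-limiting illustration of design by Watson and Crick base pairing, an antisense oligonucleotide against a chosen window of a coding strand is simply the reverse complement of that window. The following Python sketch assumes a DNA coding-strand string; the function name antisense_oligo and the short example sequence are hypothetical and are not taken from SEQ ID NO:1.

```python
# Watson-Crick complements for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(coding_strand, start, length):
    """Return the antisense DNA oligonucleotide (written 5'->3') that is
    complementary to coding_strand[start:start + length]."""
    region = coding_strand.upper()[start:start + length]
    if len(region) != length:
        raise ValueError("requested window runs past the end of the sequence")
    if set(region) - set("ACGT"):
        raise ValueError("sequence contains characters other than A, C, G, T")
    # Complement each base, then reverse so the result reads 5'->3'.
    return region.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical toy sequence with a start codon (ATG) at position 6.
    toy = "GCCACCATGGCGTCT"
    print(antisense_oligo(toy, 3, 9))  # window spanning the start codon
```

In practice the window would be chosen around the translation start site of the target mRNA, with a length in the range recited above.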

For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, including, but not limited to, phosphorothioate derivatives and acridine-substituted nucleotides. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest, described further in the following subsection).

The antisense nucleic acid molecules of the invention are typically administered to a subject or generated in situ such that they hybridize with or bind to cellular mRNA and/or genomic DNA encoding an adenylate kinase protein to thereby inhibit expression of the protein, e.g., by inhibiting transcription and/or translation. An example of a route of administration of antisense nucleic acid molecules of the invention includes direct injection at a tissue site. Alternatively, antisense nucleic acid molecules can be modified to target selected cells and then administered systemically. For example, antisense molecules can be linked to peptides or antibodies to form a complex that specifically binds to receptors or antigens expressed on a selected cell surface. The antisense nucleic acid molecules can also be delivered to cells using the vectors described herein. To achieve sufficient intracellular concentrations of the antisense molecules, vector constructs in which the antisense nucleic acid molecule is placed under the control of a strong pol II or pol III promoter are preferred.

An antisense nucleic acid molecule of the invention can be an α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms specific double-stranded hybrids with complementary RNA in which, contrary to the usual β-units, the strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res. 15:6625-6641). The antisense nucleic acid molecule can also comprise a 2′-O-methylribonucleotide (Inoue et al. (1987) Nucleic Acids Res. 15:6131-6148) or a chimeric RNA-DNA analogue (Inoue et al. (1987) FEBS Lett. 215:327-330).

The invention also encompasses ribozymes, which are catalytic RNA molecules with ribonuclease activity that are capable of cleaving a single-stranded nucleic acid, such as an mRNA, to which they have a complementary region. Ribozymes (e.g., hammerhead ribozymes (described in Haseloff and Gerlach (1988) Nature 334:585-591)) can be used to catalytically cleave adenylate kinase mRNA transcripts to thereby inhibit translation of adenylate kinase mRNA. A ribozyme having specificity for an adenylate kinase-encoding nucleic acid can be designed based upon the nucleotide sequence of an adenylate kinase cDNA disclosed herein (e.g., SEQ ID NO:1). See, e.g., Cech et al., U.S. Pat. No. 4,987,071; and Cech et al., U.S. Pat. No. 5,116,742. Alternatively, adenylate kinase mRNA can be used to select a catalytic RNA having a specific ribonuclease activity from a pool of RNA molecules. See, e.g., Bartel and Szostak (1993) Science 261:1411-1418.

The invention also encompasses nucleic acid molecules that form triple helical structures. For example, adenylate kinase gene expression can be inhibited by targeting nucleotide sequences complementary to the regulatory region of the adenylate kinase protein (e.g., the adenylate kinase promoter and/or enhancers) to form triple helical structures that prevent transcription of the adenylate kinase gene in target cells. See generally Helene (1991) Anticancer Drug Des. 6(6):569; Helene (1992) Ann. N.Y. Acad. Sci. 660:27; and Maher (1992) Bioessays 14(12):807.

In preferred embodiments, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety, or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid-phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; and Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670.

PNAs of an adenylate kinase molecule can be used in therapeutic and diagnostic applications. For example, PNAs can be used as antisense or antigene agents for sequence-specific modulation of gene expression by, e.g., inducing transcription or translation arrest or inhibiting replication. PNAs of the invention can also be used, e.g., in the analysis of single base pair mutations in a gene by, e.g., PNA-directed PCR clamping; as artificial restriction enzymes when used in combination with other enzymes, e.g., S1 nucleases (Hyrup (1996), supra); or as probes or primers for DNA sequence and hybridization (Hyrup (1996), supra; Perry-O'Keefe et al. (1996), supra).

In another embodiment, PNAs of an adenylate kinase molecule can be modified, e.g., to enhance their stability, specificity, or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra; Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63; Mag et al. (1989) Nucleic Acids Res. 17:5973; and Peterson et al. (1975) Bioorganic Med. Chem. Lett. 5:1119.

II. Isolated Adenylate Kinase Proteins and Anti-Adenylate Kinase Antibodies

Adenylate kinase proteins are also encompassed within the present invention. By “adenylate kinase protein” is intended a protein having the amino acid sequence set forth in SEQ ID NO:2, as well as fragments, biologically active portions, and variants thereof.

“Fragments” or “biologically active portions” include polypeptide fragments suitable for use as immunogens to raise anti-adenylate kinase antibodies. Fragments include peptides comprising amino acid sequences sufficiently identical to or derived from the amino acid sequence of an adenylate kinase protein of the invention and exhibiting at least one activity of an adenylate kinase protein, but which include fewer amino acids than the full-length (SEQ ID NO:2) adenylate kinase protein disclosed herein. Typically, biologically active portions comprise a domain or motif with at least one activity of the adenylate kinase protein. A biologically active portion of an adenylate kinase protein can be a polypeptide that is, for example, 10, 25, 50, 100 or more amino acids in length. Such biologically active portions can be prepared by recombinant techniques and evaluated for one or more of the functional activities of a native adenylate kinase protein. As used here, a fragment comprises at least 5 contiguous amino acids of SEQ ID NO:2. The invention encompasses other fragments, however, such as any fragment in the protein greater than 6, 7, 8, or 9 amino acids, depending upon the intended use.

By “variants” is intended proteins or polypeptides having an amino acid sequence that is at least about 45%, 55%, 65%, preferably about 75%, 85%, 95%, or 98% identical to the amino acid sequence of SEQ ID NO:2. Variants also include polypeptides encoded by the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, or polypeptides encoded by a nucleic acid molecule that hybridizes to the nucleic acid molecule of SEQ ID NO:1, or a complement thereof, under stringent conditions. Such variants generally retain the functional activity of the adenylate kinase proteins of the invention. Variants include polypeptides that differ in amino acid sequence due to natural allelic variation or mutagenesis.

The invention also provides adenylate kinase chimeric or fusion proteins. As used herein, an adenylate kinase “chimeric protein” or “fusion protein” comprises an adenylate kinase polypeptide operably linked to a non-adenylate kinase polypeptide. An “adenylate kinase polypeptide” refers to a polypeptide having an amino acid sequence corresponding to an adenylate kinase protein, whereas a “non-adenylate kinase polypeptide” refers to a polypeptide having an amino acid sequence corresponding to a protein that is not substantially identical to the adenylate kinase protein, e.g., a protein that is different from the adenylate kinase protein and which is derived from the same or a different organism. Within an adenylate kinase fusion protein, the adenylate kinase polypeptide can correspond to all or a portion of an adenylate kinase protein, preferably at least one biologically active portion of an adenylate kinase protein. Within the fusion protein, the term “operably linked” is intended to indicate that the adenylate kinase polypeptide and the non-adenylate kinase polypeptide are fused in-frame to each other. The non-adenylate kinase polypeptide can be fused to the N-terminus or C-terminus of the adenylate kinase polypeptide.

One useful fusion protein is a GST-adenylate kinase fusion protein in which the adenylate kinase sequences are fused to the C-terminus of the GST sequences. Such fusion proteins can facilitate the purification of recombinant adenylate kinase proteins.

In yet another embodiment, the fusion protein is an adenylate kinase-immunoglobulin fusion protein in which all or part of an adenylate kinase protein is fused to sequences derived from a member of the immunoglobulin protein family. The adenylate kinase-immunoglobulin fusion proteins of the invention can be incorporated into pharmaceutical compositions and administered to a subject to inhibit an interaction between an adenylate kinase ligand and an adenylate kinase protein on the surface of a cell, thereby suppressing adenylate kinase-mediated signal transduction in vivo. The adenylate kinase-immunoglobulin fusion proteins can be used to affect the bioavailability of an adenylate kinase cognate ligand. Inhibition of the adenylate kinase ligand/adenylate kinase interaction may be useful therapeutically, both for treating proliferative and differentiative disorders and for modulating (e.g., promoting or inhibiting) cell survival. Moreover, the adenylate kinase-immunoglobulin fusion proteins of the invention can be used as immunogens to produce anti-adenylate kinase antibodies in a subject, to purify adenylate kinase ligands, and in screening assays to identify molecules that inhibit the interaction of an adenylate kinase protein with an adenylate kinase ligand.

Preferably, an adenylate kinase chimeric or fusion protein of the invention is produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different polypeptide sequences may be ligated together in-frame, or the fusion gene can be synthesized, such as with automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers that give rise to complementary overhangs between two consecutive gene fragments, which can subsequently be annealed and reamplified to generate a chimeric gene sequence (see, e.g., Ausubel et al., eds. (1995) Current Protocols in Molecular Biology (Greene Publishing and Wiley-Interscience, NY)). Moreover, an adenylate kinase-encoding nucleic acid can be cloned into a commercially available expression vector such that it is linked in-frame to an existing fusion moiety.

Variants of the adenylate kinase proteins can function as either adenylate kinase agonists (mimetics) or as adenylate kinase antagonists. Variants of the adenylate kinase protein can be generated by mutagenesis, e.g., discrete point mutation or truncation of the adenylate kinase protein. An agonist of the adenylate kinase protein can retain substantially the same, or a subset, of the biological activities of the naturally occurring form of the adenylate kinase protein. An antagonist of the adenylate kinase protein can inhibit one or more of the activities of the naturally occurring form of the adenylate kinase protein by, for example, competitively binding to a downstream or upstream member of a cellular signaling cascade that includes the adenylate kinase protein. Thus, specific biological effects can be elicited by treatment with a variant of limited function. Treatment of a subject with a variant having a subset of the biological activities of the naturally occurring form of the protein can have fewer side effects in a subject relative to treatment with the naturally occurring form of the adenylate kinase proteins.

Variants of an adenylate kinase protein that function as either adenylate kinase agonists or as adenylate kinase antagonists can be identified by screening combinatorial libraries of mutants, e.g., truncation mutants, of an adenylate kinase protein for adenylate kinase protein agonist or antagonist activity. In one embodiment, a variegated library of adenylate kinase variants is generated by combinatorial mutagenesis at the nucleic acid level and is encoded by a variegated gene library. A variegated library of adenylate kinase variants can be produced by, for example, enzymatically ligating a mixture of synthetic oligonucleotides into gene sequences such that a degenerate set of potential adenylate kinase sequences is expressible as individual polypeptides, or alternatively, as a set of larger fusion proteins (e.g., for phage display) containing the set of adenylate kinase sequences therein. There are a variety of methods that can be used to produce libraries of potential adenylate kinase variants from a degenerate oligonucleotide sequence. Chemical synthesis of a degenerate gene sequence can be performed in an automatic DNA synthesizer, and the synthetic gene then ligated into an appropriate expression vector. Use of a degenerate set of genes allows for the provision, in one mixture, of all of the sequences encoding the desired set of potential adenylate kinase sequences. Methods for synthesizing degenerate oligonucleotides are known in the art (see, e.g., Narang (1983) Tetrahedron 39:3; Itakura et al. (1984) Annu. Rev. Biochem. 53:323; Itakura et al. (1984) Science 198:1056; Ike et al. (1983) Nucleic Acids Res. 11:477).
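To illustrate what “a degenerate set” of sequences means in practice, the Python sketch below expands a degenerate oligonucleotide written in the standard IUPAC nucleotide ambiguity codes into every concrete sequence it represents. The function name expand_degenerate and the example oligonucleotides are hypothetical and are offered only as a sketch; real libraries are synthesized as mixtures rather than enumerated.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def expand_degenerate(oligo):
    """Return every concrete DNA sequence encoded by a degenerate oligo."""
    try:
        pools = [IUPAC[base] for base in oligo.upper()]
    except KeyError as exc:
        raise ValueError(f"not an IUPAC nucleotide code: {exc.args[0]!r}")
    # Cartesian product of the per-position base choices.
    return ["".join(bases) for bases in product(*pools)]


if __name__ == "__main__":
    print(expand_degenerate("AR"))        # two sequences
    print(len(expand_degenerate("NNK")))  # size of an NNK codon mixture
```

The size of the set grows as the product of the number of choices at each position, which is why a single degenerate synthesis can supply, in one mixture, all sequences of a desired set.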

In addition, libraries of fragments of an adenylate kinase protein coding sequence can be used to generate a variegated population of adenylate kinase fragments for screening and subsequent selection of variants of an adenylate kinase protein. In one embodiment, a library of coding sequence fragments can be generated by treating a double-stranded PCR fragment of an adenylate kinase coding sequence with a nuclease under conditions wherein nicking occurs only about once per molecule, denaturing the double-stranded DNA, renaturing the DNA to form double-stranded DNA which can include sense/antisense pairs from different nicked products, removing single-stranded portions from reformed duplexes by treatment with S1 nuclease, and ligating the resulting fragment library into an expression vector. By this method, one can derive an expression library that encodes N-terminal and internal fragments of various sizes of the adenylate kinase protein.

Several techniques are known in the art for screening gene products of combinatorial libraries made by point mutations or truncation and for screening cDNA libraries for gene products having a selected property. Such techniques are adaptable for rapid screening of the gene libraries generated by the combinatorial mutagenesis of adenylate kinase proteins. The most widely used techniques, which are amenable to high-throughput analysis, for screening large gene libraries typically include cloning the gene library into replicable expression vectors, transforming appropriate cells with the resulting library of vectors, and expressing the combinatorial genes under conditions in which detection of a desired activity facilitates isolation of the vector encoding the gene whose product was detected. Recursive ensemble mutagenesis (REM), a technique that enhances the frequency of functional mutants in the libraries, can be used in combination with the screening assays to identify adenylate kinase variants (Arkin and Youvan (1992) Proc. Natl. Acad. Sci. USA 89:7811-7815; Delagrave et al. (1993) Protein Engineering 6(3):327-331).

An isolated adenylate kinase polypeptide of the invention can be used as an immunogen to generate antibodies that bind adenylate kinase proteins using standard techniques for polyclonal and monoclonal antibody preparation. The full-length adenylate kinase protein can be used or, alternatively, the invention provides antigenic peptide fragments of adenylate kinase proteins for use as immunogens. The antigenic peptide of an adenylate kinase protein comprises at least 8, preferably 10, 15, 20, or 30 amino acid residues of the amino acid sequence shown in SEQ ID NO:2 and encompasses an epitope of an adenylate kinase protein such that an antibody raised against the peptide forms a specific immune complex with the adenylate kinase protein. Preferred epitopes encompassed by the antigenic peptide are regions of an adenylate kinase protein that are located on the surface of the protein, e.g., hydrophilic regions.
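One simple way to shortlist hydrophilic regions as candidate antigenic peptides is a sliding-window hydropathy scan. The Python sketch below uses the Kyte-Doolittle hydropathy scale, which is an assumption made here for illustration and is not named in the text; low window scores are treated as a crude proxy for hydrophilic, possibly surface-exposed, stretches. The function name most_hydrophilic_window and the toy sequence are hypothetical, and the default window of 8 residues mirrors the minimum antigenic peptide length recited above.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def most_hydrophilic_window(sequence, width=8):
    """Return (start_index, peptide) for the window of `width` residues
    with the lowest summed hydropathy; the first such window wins ties."""
    seq = sequence.upper()
    if len(seq) < width:
        raise ValueError("sequence is shorter than the window width")

    def window_score(i):
        return sum(KYTE_DOOLITTLE[res] for res in seq[i:i + width])

    start = min(range(len(seq) - width + 1), key=window_score)
    return start, seq[start:start + width]


if __name__ == "__main__":
    # Hypothetical toy sequence: a lysine-rich stretch flanked by leucines.
    print(most_hydrophilic_window("LLLLLLLLKKKKKKKKLLLLLLLL"))
```

Such a scan only ranks candidates; whether a peptide actually raises antibodies that bind the native protein has to be determined experimentally.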

Accordingly, another aspect of the invention pertains to anti-adenylate kinase polyclonal and monoclonal antibodies that bind an adenylate kinase protein. Polyclonal anti-adenylate kinase antibodies can be prepared by immunizing a suitable subject (e.g., rabbit, goat, mouse, or other mammal) with an adenylate kinase immunogen. The anti-adenylate kinase antibody titer in the immunized subject can be monitored over time by standard techniques, such as with an enzyme linked immunosorbent assay (ELISA) using immobilized adenylate kinase protein. At an appropriate time after immunization, e.g., when the anti-adenylate kinase antibody titers are highest, antibody-producing cells can be obtained from the subject and used to prepare monoclonal antibodies by standard techniques, such as the hybridoma technique originally described by Kohler and Milstein (1975) Nature 256:495-497, the human B cell hybridoma technique (Kozbor et al. (1983) Immunol. Today 4:72), the EBV-hybridoma technique (Cole et al. (1985) in Monoclonal Antibodies and Cancer Therapy, ed. Reisfeld and Sell (Alan R. Liss, Inc., New York, N.Y.), pp. 77-96) or trioma techniques. The technology for producing hybridomas is well known (see generally Coligan et al., eds. (1994) Current Protocols in Immunology (John Wiley & Sons, Inc., New York, N.Y.); Galfre et al. (1977) Nature 266:550-52; Kenneth (1980) in Monoclonal Antibodies: A New Dimension In Biological Analyses (Plenum Publishing Corp., NY); and Lerner (1981) Yale J. Biol. Med., 54:387-402).

As an alternative to preparing monoclonal antibody-secreting hybridomas, a monoclonal anti-adenylate kinase antibody can be identified and isolated by screening a recombinant combinatorial immunoglobulin library (e.g., an antibody phage display library) with an adenylate kinase protein to thereby isolate immunoglobulin library members that bind the adenylate kinase protein. Kits for generating and screening phage display libraries are commercially available (e.g., the Pharmacia Recombinant Phage Antibody System, Catalog No. 27-9400-01; and the Stratagene SurfZAP™ Phage Display Kit, Catalog No. 240612). Additionally, examples of methods and reagents particularly amenable for use in generating and screening antibody display libraries can be found in, for example, U.S. Pat. No. 5,223,409; PCT Publication Nos. WO 92/18619; WO 91/17271; WO 92/20791; WO 92/15679; WO 93/01288; WO 92/01047; WO 92/09690; and WO 90/02809; Fuchs et al. (1991) Bio/Technology 9:1370-1372; Hay et al. (1992) Hum. Antibod. Hybridomas 3:81-85; Huse et al. (1989) Science 246:1275-1281; and Griffiths et al. (1993) EMBO J. 12:725-734.

Additionally, recombinant anti-adenylate kinase antibodies, such as chimeric and humanized monoclonal antibodies, comprising both human and nonhuman portions, which can be made using standard recombinant DNA techniques, are within the scope of the invention. Such chimeric and humanized monoclonal antibodies can be produced by recombinant DNA techniques known in the art, for example using methods described in PCT Publication Nos. WO 86/01533 and WO 87/02671; European Patent Application Nos. 184,187, 171,496, 125,023, and 173,494; U.S. Pat. Nos. 4,816,567 and 5,225,539; Better et al. (1988) Science 240:1041-1043; Liu et al. (1987) Proc. Natl. Acad. Sci. USA 84:3439-3443; Liu et al. (1987) J. Immunol. 139:3521-3526; Sun et al. (1987) Proc. Natl. Acad. Sci. USA 84:214-218; Nishimura et al. (1987) Canc. Res. 47:999-1005; Wood et al. (1985) Nature 314:446-449; Shaw et al. (1988) J. Natl. Cancer Inst. 80:1553-1559; Morrison (1985) Science 229:1202-1207; Oi et al. (1986) Bio/Techniques 4:214; Jones et al. (1986) Nature 321:522-525; Verhoeyen et al. (1988) Science 239:1534; and Beidler et al. (1988) J. Immunol. 141:4053-4060.

Completely human antibodies are particularly desirable for therapeutic treatment of human patients. Such antibodies can be produced using transgenic mice that are incapable of expressing endogenous immunoglobulin heavy and light chain genes, but which can express human heavy and light chain genes. See, for example, Lonberg and Huszar (1995) Int. Rev. Immunol. 13:65-93; and U.S. Pat. Nos. 5,625,126; 5,633,425; 5,569,825; 5,661,016; and 5,545,806. In addition, companies such as Abgenix, Inc. (Fremont, Calif.), can be engaged to provide human antibodies directed against a selected antigen using technology similar to that described above.

Completely human antibodies that recognize a selected epitope can be generated using a technique referred to as “guided selection.” In this approach a selected non-human monoclonal antibody, e.g., a murine antibody, is used to guide the selection of a completely human antibody recognizing the same epitope. This technology is described by Jespers et al. (1994) Bio/Technology 12:899-903.

An anti-adenylate kinase antibody (e.g., monoclonal antibody) can be used to isolate adenylate kinase proteins by standard techniques, such as affinity chromatography or immunoprecipitation. An anti-adenylate kinase antibody can facilitate the purification of natural adenylate kinase protein from cells and of recombinantly produced adenylate kinase protein expressed in host cells. Moreover, an anti-adenylate kinase antibody can be used to detect adenylate kinase protein (e.g., in a cellular lysate or cell supernatant) in order to evaluate the abundance and pattern of expression of the adenylate kinase protein. Anti-adenylate kinase antibodies can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen. Detection can be facilitated by coupling the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S, or ³H.

Further, an antibody (or fragment thereof) may be conjugated to a therapeutic moiety such as a cytotoxin, a therapeutic agent or a radioactive metal ion. A cytotoxin or cytotoxic agent includes any agent that is detrimental to cells. Examples include taxol, cytochalasin B, gramicidin D, ethidium bromide, emetine, mitomycin, etoposide, teniposide, vincristine, vinblastine, colchicine, doxorubicin, daunorubicin, dihydroxy anthracin dione, mitoxantrone, mithramycin, actinomycin D, 1-dehydrotestosterone, glucocorticoids, procaine, tetracaine, lidocaine, propranolol, and puromycin and analogs or homologs thereof. Therapeutic agents include, but are not limited to, antimetabolites (e.g., methotrexate, 6-mercaptopurine, 6-thioguanine, cytarabine, 5-fluorouracil, dacarbazine), alkylating agents (e.g., mechlorethamine, thiotepa, chlorambucil, melphalan, carmustine (BCNU) and lomustine (CCNU), cyclophosphamide, busulfan, dibromomannitol, streptozotocin, mitomycin C, and cis-dichlorodiamine platinum (II) (DDP) cisplatin), anthracyclines (e.g., daunorubicin (formerly daunomycin) and doxorubicin), antibiotics (e.g., dactinomycin (formerly actinomycin), bleomycin, mithramycin, and anthramycin (AMC)), and anti-mitotic agents (e.g., vincristine and vinblastine). The conjugates of the invention can be used for modifying a given biological response; the drug moiety is not to be construed as limited to classical chemical therapeutic agents. For example, the drug moiety may be a protein or polypeptide possessing a desired biological activity. Such proteins may include, for example, a toxin such as abrin, ricin A, pseudomonas exotoxin, or diphtheria toxin; a protein such as tumor necrosis factor, α-interferon, β-interferon, nerve growth factor, platelet derived growth factor, tissue plasminogen activator; or biological response modifiers such as, for example, lymphokines, interleukin-1 (“IL-1”), interleukin-2 (“IL-2”), interleukin-6 (“IL-6”), granulocyte macrophage colony stimulating factor (“GM-CSF”), granulocyte colony stimulating factor (“G-CSF”), or other growth factors.

Techniques for conjugating such therapeutic moiety to antibodies are well known; see, e.g., Arnon et al. (1985) “Monoclonal Antibodies for Immunotargeting of Drugs in Cancer Therapy,” in Monoclonal Antibodies And Cancer Therapy, ed. Reisfeld et al. (Alan R. Liss, Inc.), pp. 243-56; Hellstrom et al. (1987) “Antibodies for Drug Delivery,” in Controlled Drug Delivery, ed. Robinson et al. (2d ed., Marcel Dekker, Inc.), pp. 623-53; Thorpe (1985) “Antibody Carriers of Cytotoxic Agents in Cancer Therapy: A Review,” in Monoclonal Antibodies '84: Biological And Clinical Applications, ed. Pinchera et al., pp. 475-506; “Analysis, Results, and Future Prospective of the Therapeutic Use of Radiolabeled Antibody in Cancer Therapy,” in Monoclonal Antibodies For Cancer Detection And Therapy, ed. Baldwin et al. (Academic Press, NY), pp. 303-316; and Thorpe et al. (1982) Immunol. Rev. 62:119-58. Alternatively, an antibody can be conjugated to a second antibody to form an antibody heteroconjugate as described by Segal in U.S. Pat. No. 4,676,980.

III. Recombinant Expression Vectors and Host Cells

Another aspect of the invention pertains to vectors, preferably expression vectors, containing a nucleic acid encoding an adenylate kinase protein (or a portion thereof). “Vector” refers to a nucleic acid molecule capable of transporting another nucleic acid to which it has been linked, such as a “plasmid”, a circular double-stranded DNA loop into which additional DNA segments can be ligated, or a viral vector, where additional DNA segments can be ligated into the viral genome. The vectors are useful for autonomous replication in a host cell or may be integrated into the genome of a host cell upon introduction into the host cell, and thereby are replicated along with the host genome (e.g., nonepisomal mammalian vectors). Expression vectors are capable of directing the expression of genes to which they are operably linked. In general, expression vectors of utility in recombinant DNA techniques are often in the form of plasmids (vectors). However, the invention is intended to include such other forms of expression vectors, such as viral vectors (e.g., replication defective retroviruses, adenoviruses, and adeno-associated viruses), that serve equivalent functions.

The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, operably linked to the nucleic acid sequence to be expressed. “Operably linked” is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell). The term “regulatory sequence” is intended to include promoters, enhancers, and other expression control elements (e.g., polyadenylation signals). See, for example, Goeddel (1990) in Gene Expression Technology: Methods in Enzymology 185 (Academic Press, San Diego, Calif.). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., adenylate kinase proteins, mutant forms of adenylate kinase proteins, fusion proteins, etc.).

The recombinant expression vectors of the invention can be designed for expression of adenylate kinase protein in prokaryotic or eukaryotic host cells. Expression of proteins in prokaryotes is most often carried out in E. coli with vectors containing constitutive or inducible promoters directing the expression of either fusion or nonfusion proteins. Fusion vectors add a number of amino acids to a protein encoded therein, usually to the amino terminus of the recombinant protein. Typical fusion expression vectors include pGEX (Pharmacia Biotech Inc; Smith and Johnson (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.), and pRIT5 (Pharmacia, Piscataway, N.J.), which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible nonfusion E. coli expression vectors include pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al. (1990) in Gene Expression Technology: Methods in Enzymology 185 (Academic Press, San Diego, Calif.), pp. 60-89). Strategies to maximize recombinant protein expression in E. coli can be found in Gottesman (1990) in Gene Expression Technology: Methods in Enzymology 185 (Academic Press, CA), pp. 119-128 and Wada et al. (1992) Nucleic Acids Res. 20:2111-2118. Target gene expression from the pTrc vector relies on host RNA polymerase transcription from a hybrid trp-lac fusion promoter.

Suitable eukaryotic host cells include insect cells (examples of Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell. Biol. 3:2156-2165) and the pVL series (Luckow and Summers (1989) Virology 170:31-39)); yeast cells (examples of vectors for expression in yeast S. cerevisiae include pYepSec1 (Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan and Herskowitz (1982) Cell 30:933-943), pJRY88 (Schultz et al. (1987) Gene 54:113-123), pYES2 (Invitrogen Corporation, San Diego, Calif.), and pPicZ (Invitrogen Corporation, San Diego, Calif.)); or mammalian cells (mammalian expression vectors include pCDM8 (Seed (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195)). Suitable mammalian cells include Chinese hamster ovary cells (CHO) or COS cells. In mammalian cells, the expression vector's control functions are often provided by viral regulatory elements. For example, commonly used promoters are derived from polyoma, Adenovirus 2, cytomegalovirus, and Simian Virus 40. For other suitable expression systems for both prokaryotic and eukaryotic cells, see chapters 16 and 17 of Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.). See also Goeddel (1990) in Gene Expression Technology: Methods in Enzymology 185 (Academic Press, San Diego, Calif.). Alternatively, the recombinant expression vector can be transcribed and translated in vitro, for example using T7 promoter regulatory sequences and T7 polymerase.

The terms “host cell” and “recombinant host cell” are used interchangeably herein. It is understood that such terms refer not only to the particular subject cell but to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell but are still included within the scope of the term as used herein.

In one embodiment, the expression vector is a recombinant mammalian expression vector that comprises tissue-specific regulatory elements that direct expression of the nucleic acid preferentially in a particular cell type. Suitable tissue-specific promoters include the albumin promoter (liver-specific; Pinkert et al. (1987) Genes Dev. 1:268-277), lymphoid-specific promoters (Calame and Eaton (1988) Adv. Immunol. 43:235-275), in particular promoters of T cell receptors (Winoto and Baltimore (1989) EMBO J. 8:729-733) and immunoglobulins (Banerji et al. (1983) Cell 33:729-740; Queen and Baltimore (1983) Cell 33:741-748), neuron-specific promoters (e.g., the neurofilament promoter; Byrne and Ruddle (1989) Proc. Natl. Acad. Sci. USA 86:5473-5477), pancreas-specific promoters (Edlund et al. (1985) Science 230:912-916), and mammary gland-specific promoters (e.g., milk whey promoter; U.S. Pat. No. 4,873,316 and European Application Patent Publication No. 264,166). Developmentally-regulated promoters are also encompassed, for example the murine hox promoters (Kessel and Gruss (1990) Science 249:374-379), the α-fetoprotein promoter (Camper and Tilghman (1989) Genes Dev. 3:537-546), and the like.

The invention further provides a recombinant expression vector comprising a DNA molecule of the invention cloned into the expression vector in an antisense orientation. That is, the DNA molecule is operably linked to a regulatory sequence in a manner that allows for expression (by transcription of the DNA molecule) of an RNA molecule that is antisense to adenylate kinase mRNA. Regulatory sequences operably linked to a nucleic acid cloned in the antisense orientation can be chosen to direct the continuous expression of the antisense RNA molecule in a variety of cell types, for instance viral promoters and/or enhancers, or regulatory sequences can be chosen to direct constitutive, tissue-specific, or cell-type-specific expression of antisense RNA. The antisense expression vector can be in the form of a recombinant plasmid, phagemid, or attenuated virus in which antisense nucleic acids are produced under the control of a high efficiency regulatory region, the activity of which can be determined by the cell type into which the vector is introduced. For a discussion of the regulation of gene expression using antisense genes see Weintraub et al. (1986) Reviews - Trends in Genetics, Vol. 1(1).
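By way of illustration only, the relationship between a coding-strand insert and the RNA transcribed from it in the antisense orientation can be sketched in Python. The function names and the six-nucleotide sequence are hypothetical placeholders chosen for the example; the sequence is not a sequence of the invention, and the sketch models only base complementarity, not vector construction.

```python
# Illustrative sketch only. The example sequence is a made-up placeholder,
# not an adenylate kinase sequence from this disclosure.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def sense_mrna(coding_dna: str) -> str:
    """mRNA has the same sequence as the coding strand, with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def antisense_rna(coding_dna: str) -> str:
    """RNA produced when the insert is transcribed in antisense orientation.

    It is the reverse complement of the coding strand (U in place of T),
    so it can base-pair with the corresponding sense mRNA.
    """
    return reverse_complement(coding_dna).replace("T", "U")


if __name__ == "__main__":
    placeholder = "ATGGCC"  # hypothetical insert fragment
    print(sense_mrna(placeholder))
    print(antisense_rna(placeholder))
```

Read 5′ to 3′, the antisense transcript pairs base for base with the sense mRNA read 3′ to 5′, which is the property the antisense orientation relies on.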

Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. As used herein, the terms “transformation” and “transfection” are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y.) and other laboratory manuals.

For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin, and methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell on the same vector as that encoding an adenylate kinase protein or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die).

A host cell of the invention, such as a prokaryotic or eukaryotic host cell in culture, can be used to produce (i.e., express) adenylate kinase protein. Accordingly, the invention further provides methods for producing adenylate kinase protein using the host cells of the invention. In one embodiment, the method comprises culturing the host cell of the invention, into which a recombinant expression vector encoding an adenylate kinase protein has been introduced, in a suitable medium such that adenylate kinase protein is produced. In another embodiment, the method further comprises isolating adenylate kinase protein from the medium or the host cell.

The host cells of the invention can also be used to produce nonhuman transgenic animals. In general, methods for producing transgenic animals include introducing a nucleic acid sequence according to the present invention, the nucleic acid sequence capable of expressing the receptor protein in a transgenic animal, into a cell in culture or in vivo. When introduced in vivo, the nucleic acid is introduced into an intact organism such that one or more cell types and, accordingly, one or more tissue types, express the nucleic acid encoding the receptor protein. Alternatively, the nucleic acid can be introduced into virtually all cells in an organism by transfecting a cell in culture, such as an embryonic stem cell, as described herein for the production of transgenic animals, and this cell can be used to produce an entire transgenic organism. As described, in a further embodiment, the host cell can be a fertilized oocyte. Such cells are then allowed to develop in a female foster animal to produce the transgenic organism.

For example, in one embodiment, a host cell of the invention is a fertilized oocyte or an embryonic stem cell into which adenylate kinase-coding sequences have been introduced. Such host cells can then be used to create nonhuman transgenic animals in which exogenous adenylate kinase sequences have been introduced into their genome or homologous recombinant animals in which endogenous adenylate kinase sequences have been altered. Such animals are useful for studying the function and/or activity of adenylate kinase genes and proteins and for identifying and/or evaluating modulators of adenylate kinase activity. As used herein, a “transgenic animal” is a nonhuman animal, preferably a mammal, more preferably a rodent such as a rat or mouse, in which one or more of the cells of the animal includes a transgene. Other examples of transgenic animals include nonhuman primates, sheep, dogs, cows, goats, chickens, amphibians, etc. A transgene is exogenous DNA that is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal, thereby directing the expression of an encoded gene product in one or more cell types or tissues of the transgenic animal. As used herein, a “homologous recombinant animal” is a nonhuman animal, preferably a mammal, more preferably a mouse, in which an endogenous adenylate kinase gene has been altered by homologous recombination between the endogenous gene and an exogenous DNA molecule introduced into a cell of the animal, e.g., an embryonic cell of the animal, prior to development of the animal.

A transgenic animal of the invention can be created by introducing adenylate kinase-encoding nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. The adenylate kinase cDNA sequence can be introduced as a transgene into the genome of a nonhuman animal. Alternatively, a homologue of the mouse adenylate kinase gene can be isolated based on hybridization and used as a transgene. Intronic sequences and polyadenylation signals can also be included in the transgene to increase the efficiency of expression of the transgene. A tissue-specific regulatory sequence(s) can be operably linked to the adenylate kinase transgene to direct expression of adenylate kinase protein to particular cells. Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866, 4,870,009, and 4,873,191 and in Hogan (1986) Manipulating the Mouse Embryo (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the adenylate kinase transgene in its genome and/or expression of adenylate kinase mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene encoding adenylate kinase can further be bred to other transgenic animals carrying other transgenes.

To create a homologous recombinant animal, one prepares a vector containing at least a portion of an adenylate kinase gene or a homolog of the gene into which a deletion, addition, or substitution has been introduced to thereby alter, e.g., functionally disrupt, the adenylate kinase gene. In a preferred embodiment, the vector is designed such that, upon homologous recombination, the endogenous adenylate kinase gene is functionally disrupted (i.e., no longer encodes a functional protein; also referred to as a “knock out” vector). Alternatively, the vector can be designed such that, upon homologous recombination, the endogenous adenylate kinase gene is mutated or otherwise altered but still encodes functional protein (e.g., the upstream regulatory region can be altered to thereby alter the expression of the endogenous adenylate kinase protein). In the homologous recombination vector, the altered portion of the adenylate kinase gene is flanked at its 5′ and 3′ ends by additional nucleic acid of the adenylate kinase gene to allow for homologous recombination to occur between the exogenous adenylate kinase gene carried by the vector and an endogenous adenylate kinase gene in an embryonic stem cell. The additional flanking adenylate kinase nucleic acid is of sufficient length for successful homologous recombination with the endogenous gene. Typically, several kilobases of flanking DNA (both at the 5′ and 3′ ends) are included in the vector (see, e.g., Thomas and Capecchi (1987) Cell 51:503 for a description of homologous recombination vectors). The vector is introduced into an embryonic stem cell line (e.g., by electroporation), and cells in which the introduced adenylate kinase gene has homologously recombined with the endogenous adenylate kinase gene are selected (see, e.g., Li et al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see, e.g., Bradley (1987) in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, ed. Robertson (IRL, Oxford), pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley (1991) Current Opinion in Bio/Technology 2:823-829 and in PCT Publication Nos. WO 90/11354, WO 91/01140, WO 92/0968, and WO 93/04169.

In another embodiment, transgenic nonhuman animals containing selected systems that allow for regulated expression of the transgene can be produced. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) Proc. Natl. Acad. Sci. USA 89:6232-6236. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the nonhuman transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature 385:810-813 and PCT Publication Nos. WO 97/07668 and WO 97/07669.

IV. Pharmaceutical Compositions

The adenylate kinase nucleic acid molecules, adenylate kinase proteins, and anti-adenylate kinase antibodies (also referred to herein as “active compounds”) of the invention can be incorporated into pharmaceutical compositions suitable for administration. Such compositions typically comprise the nucleic acid molecule, protein, or antibody and a pharmaceutically acceptable carrier. As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, use thereof in the compositions is contemplated. Supplementary active compounds can also be incorporated into the compositions.

The compositions of the invention are useful to treat any of the disorders discussed herein. The compositions are provided in therapeutically effective amounts. By “therapeutically effective amounts” is intended an amount sufficient to modulate the desired response. As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight.
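For illustration, the arithmetic relating the per-kilogram ranges recited above to a total amount for a subject of a given body weight can be sketched as follows. The class name, the tier labels, and the 70 kg body weight are assumptions introduced for the example; only the numeric bounds are taken from the text, and the sketch is not a dosing recommendation.

```python
# Illustrative arithmetic only; tier labels and the example body weight are
# assumptions made for this sketch, not terms defined in the specification.
from typing import Dict, NamedTuple, Tuple


class DoseRange(NamedTuple):
    low_mg_per_kg: float
    high_mg_per_kg: float

    def total_mg(self, body_weight_kg: float) -> Tuple[float, float]:
        """Scale the per-kilogram bounds to a total amount in milligrams."""
        if body_weight_kg <= 0:
            raise ValueError("body weight must be positive")
        return (self.low_mg_per_kg * body_weight_kg,
                self.high_mg_per_kg * body_weight_kg)

    def contains(self, other: "DoseRange") -> bool:
        """True if `other` lies entirely within this range."""
        return (self.low_mg_per_kg <= other.low_mg_per_kg
                and other.high_mg_per_kg <= self.high_mg_per_kg)


# Bounds as recited in the paragraph above (mg per kg body weight).
PROTEIN_RANGES: Dict[str, DoseRange] = {
    "broadest": DoseRange(0.001, 30.0),
    "preferred": DoseRange(0.01, 25.0),
    "more_preferred": DoseRange(0.1, 20.0),
    "even_more_preferred": DoseRange(1.0, 10.0),
}

if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical subject
    for label, dose_range in PROTEIN_RANGES.items():
        low, high = dose_range.total_mg(weight_kg)
        print(f"{label}: {low:g} mg to {high:g} mg")
```

Each successively preferred range is nested inside the one before it, which the `contains` helper makes explicit.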

The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments. In a preferred example, a subject is treated with antibody, protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays as described herein.

The present invention encompasses agents that modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

It is understood that appropriate doses of small molecule agents depend upon a number of factors within the ken of the ordinarily skilled physician, veterinarian, or researcher. The dose(s) of the small molecule will vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be determined using the assays described herein. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.
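The low-dose-first approach described above can be illustrated as simple bookkeeping. In the sketch below, the function name, the doubling factor, and the response test are all hypothetical; the 1 to 50 microgram per kilogram bounds are one of the exemplary ranges recited above. The code only enumerates a candidate sequence of doses and says nothing about how a response would actually be assessed.

```python
# Illustrative bookkeeping only. The escalation factor and the response
# predicate are assumptions for this sketch, not part of the disclosure.
from typing import Callable, List


def escalate_dose(start_ug_per_kg: float,
                  ceiling_ug_per_kg: float,
                  response_ok: Callable[[float], bool],
                  factor: float = 2.0) -> List[float]:
    """List the doses tried, starting low and increasing by `factor`.

    Stops at the first dose for which `response_ok` returns True, or when
    the next dose would exceed `ceiling_ug_per_kg`.
    """
    if start_ug_per_kg <= 0:
        raise ValueError("starting dose must be positive")
    if factor <= 1.0:
        raise ValueError("factor must be greater than 1")

    tried: List[float] = []
    dose = start_ug_per_kg
    while dose <= ceiling_ug_per_kg:
        tried.append(dose)
        if response_ok(dose):
            break
        dose *= factor
    return tried


if __name__ == "__main__":
    # Hypothetical stand-in for an assay readout: respond at 8 ug/kg or more.
    steps = escalate_dose(1.0, 50.0, lambda dose: dose >= 8.0)
    print(steps)
```

If no dose within the range satisfies the predicate, the returned list simply ends at the last dose below the ceiling.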

A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes, or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL® (BASF; Parsippany, N.J.), or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion, and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride, in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent that delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., an adenylate kinase protein or anti-adenylate kinase antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle that contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth, or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring. For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art. The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. Dosage unit form as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated, with each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. Depending on the type and severity of the disease, about 1 μg/kg to about 15 mg/kg (e.g., 0.1 to 20 mg/kg) of antibody is an initial candidate dosage for administration to the patient, whether, for example, by one or more separate administrations, or by continuous infusion. A typical daily dosage might range from about 1 μg/kg to about 100 mg/kg or more, depending on the factors mentioned above. For repeated administrations over several days or longer, depending on the condition, the treatment is sustained until a desired suppression of disease symptoms occurs. However, other dosage regimens may be useful. The progress of this therapy is easily monitored by conventional techniques and assays. An exemplary dosing regimen is disclosed in WO 94/04188. The specification for the dosage unit forms of the invention is dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectorsand used as gene therapy vectors. Gene therapy vectors can be deliveredto a subject by, for example, intravenous injection, localadministration (U.S. Pat. No. 5,328,470), or by stereotactic injection(see, e.g., Chen et al. (1994) Proc. Natl. Acad. Sci. USA 91:3054-3057).The pharmaceutical preparation of the gene therapy vector can includethe gene therapy vector in an acceptable diluent, or can comprise a slowrelease matrix in which the gene delivery vehicle is imbedded.Alternatively, where the complete gene delivery vector can be producedintact from recombinant cells, e.g., retroviral vectors, thepharmaceutical preparation can include one or more cells which producethe gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight.
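The per-kilogram ranges above translate into total amounts by simple multiplication with body weight. The following is a minimal illustrative sketch only; the function name, the tier labels, and the 70 kg example weight are assumptions introduced here and are not part of the specification.

```python
# Illustrative sketch: convert mg/kg dosage ranges into total mg for one subject.
# The tier labels and the 70 kg example weight are hypothetical.

DOSAGE_TIERS_MG_PER_KG = [
    ("broadest", 0.001, 30.0),
    ("preferred", 0.01, 25.0),
    ("more preferred", 0.1, 20.0),
    ("even more preferred", 1.0, 10.0),
]


def total_dose_range_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return (low, high) total dose in mg for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound exceeds high bound")
    return (low_mg_per_kg * body_weight_kg, high_mg_per_kg * body_weight_kg)


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical subject weight
    for label, low, high in DOSAGE_TIERS_MG_PER_KG:
        lo_mg, hi_mg = total_dose_range_mg(weight_kg, low, high)
        print(f"{label}: about {lo_mg:g} mg to about {hi_mg:g} mg")
```

Such a calculation gives only the arithmetic bounds; the factors discussed in the surrounding text still govern the dose actually selected.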

The skilled artisan will appreciate that certain factors may influencethe dosage required to effectively treat a subject, including but notlimited to the severity of the disease or disorder, previous treatments,the general health and/or age of the subject, and other diseasespresent. Moreover, treatment of a subject with a therapeuticallyeffective amount of a protein, polypeptide, or antibody can include asingle treatment or, preferably, can include a series of treatments. Ina preferred example, a subject is treated with antibody, protein, orpolypeptide in the range of between about 0.1 to 20 mg/kg body weight,one time per week for between about 1 to 10 weeks, preferably between 2to 8 weeks, more preferably between about 3 to 7 weeks, and even morepreferably for about 4, 5, or 6 weeks. It will also be appreciated thatthe effective dosage of antibody, protein, or polypeptide used fortreatment may increase or decrease over the course of a particulartreatment. Changes in dosage may result and become apparent from theresults of diagnostic assays as described herein.

The present invention encompasses agents which modulate expression oractivity. An agent may, for example, be a small molecule. For example,such small molecules include, but are not limited to, peptides,peptidomimetics, amino acids, amino acid analogs, polynucleotides,polynucleotide analogs, nucleotides, nucleotide analogs, organic orinorganic compounds (i.e., including heteroorganic and organometalliccompounds) having a molecular weight less than about 10,000 grams permole, organic or inorganic compounds having a molecular weight less thanabout 5,000 grams per mole, organic or inorganic compounds having amolecular weight less than about 1,000 grams per mole, organic orinorganic compounds having a molecular weight less than about 500 gramsper mole, and salts, esters, and other pharmaceutically acceptable formsof such compounds.

It is understood that appropriate doses of small molecule agents depend upon a number of factors within the ken of the ordinarily skilled physician, veterinarian, or researcher. The dose(s) of the small molecule will vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be determined using the assays described herein. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

Computer Readable Means

The nucleotide or amino acid sequences of the invention are also provided in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a nucleotide or amino acid sequence of the present invention. Such a manufacture provides the nucleotide or amino acid sequences, or a subset thereof (e.g., a subset of open reading frames (ORFs)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form.

In one application of this embodiment, a nucleotide or amino acid sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. The skilled artisan will readily appreciate how any of the presently known computer readable media can be used to create a manufacture comprising a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. The skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide or amino acid sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
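As one hedged illustration of the two storage options mentioned above (an ASCII text file and a database application), the sketch below records sequences in a plain FASTA-style text file and in a relational table. SQLite is used here purely as a freely available stand-in for the database products named in the text; the file layout, table name, function names, and the placeholder sequence are all assumptions, and the placeholder is not a sequence of the invention.

```python
# Illustrative sketch: record sequence information as an ASCII file and in a database.
# SQLite stands in for the database applications named in the text (an assumption).
import sqlite3
import tempfile
from pathlib import Path


def write_ascii(path, records, width=60):
    """Write {name: sequence} to a FASTA-style ASCII file, wrapped at `width`."""
    with open(path, "w", encoding="ascii") as handle:
        for name, seq in records.items():
            handle.write(f">{name}\n")
            for i in range(0, len(seq), width):
                handle.write(seq[i:i + width] + "\n")


def read_ascii(path):
    """Read a FASTA-style ASCII file back into {name: sequence}."""
    chunks, name = {}, None
    with open(path, encoding="ascii") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:]
                chunks[name] = []
            elif name is not None:
                chunks[name].append(line)
    return {key: "".join(parts) for key, parts in chunks.items()}


def store_in_database(conn, records):
    """Store {name: sequence} in a table named `sequences` (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences (name, residues) VALUES (?, ?)",
        list(records.items()),
    )
    conn.commit()


if __name__ == "__main__":
    placeholder = {"placeholder_seq": "ATGGCC" * 25}  # not a real sequence
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "sequences.txt"
        write_ascii(target, placeholder)
        print(read_ascii(target) == placeholder)
    connection = sqlite3.connect(":memory:")
    store_in_database(connection, placeholder)
    print(connection.execute("SELECT COUNT(*) FROM sequences").fetchone()[0])
```

The choice between the two forms follows the point made in the text: a flat file suits sequential reading, while a table suits lookup by name or other indexed access.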

By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif.
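A very simple form of such a search means is an exact-match scan of the stored sequences on both strands. The sketch below is illustrative only: it performs exact string matching, which is far simpler than the alignment-based software named later in this section, and the function names and data layout are assumptions.

```python
# Illustrative sketch: exact-match "search means" over stored nucleotide sequences.
# This is plain substring search, not an alignment algorithm such as BLAST.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_target(stored, target):
    """Return (name, strand, position) hits; positions are 0-based on the stored strand."""
    target = target.upper()
    queries = [("+", target)]
    rc = reverse_complement(target)
    if rc != target:  # avoid reporting palindromic targets twice
        queries.append(("-", rc))
    hits = []
    for name, seq in stored.items():
        seq = seq.upper()
        for strand, query in queries:
            start = seq.find(query)
            while start != -1:
                hits.append((name, strand, start))
                start = seq.find(query, start + 1)  # allow overlapping hits
    return hits


if __name__ == "__main__":
    database = {"placeholder_seq": "AAACCCGGGTTT"}  # hypothetical stored sequence
    print(find_target(database, "AAACC"))
```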

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

For example, software which implements the BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al. (1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system can be used to identify open reading frames (ORFs) of the sequences of the invention which contain homology to ORFs or proteins from other libraries. Such ORFs are protein encoding fragments and are useful in producing commercially important proteins such as enzymes used in various reactions and in the production of commercially useful metabolites.
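Before any homology comparison, candidate ORFs must first be located in the stored sequence. The sketch below is a minimal six-frame scan for ATG-to-stop ORFs, assuming the standard stop codons; it does not implement BLAST or BLAZE, and the function names and the minimum-length parameter are assumptions.

```python
# Illustrative sketch: six-frame scan for open reading frames (ATG ... stop).
# Assumes the standard stop codons; homology searching is out of scope here.

STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=30):
    """Return (strand, frame, start, end, orf) tuples.

    Coordinates are 0-based, end-exclusive, on the scanned strand; `min_codons`
    counts codons before the stop codon. The first ATG in each open stretch is used.
    """
    results = []
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand_seq) - 2, 3):
                codon = strand_seq[i:i + 3]
                if start is None:
                    if codon == "ATG":
                        start = i
                elif codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        results.append(
                            (strand, frame, start, i + 3, strand_seq[start:i + 3])
                        )
                    start = None
    return results


if __name__ == "__main__":
    toy = "CCATGAAATTTGGGTAACC"  # hypothetical toy sequence
    for orf in find_orfs(toy, min_codons=3):
        print(orf)
```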

V. Uses and Methods of the Invention

The nucleic acid molecules, proteins, protein homologues, and antibodiesdescribed herein can be used in one or more of the following methods:(a) screening assays; (b) detection assays (e.g., chromosomal mapping,tissue typing, forensic biology); (c) predictive medicine (e.g.,diagnostic assays, prognostic assays, monitoring clinical trials, andpharmacogenomics); and (d) methods of treatment (e.g., therapeutic andprophylactic). The isolated nucleic acid molecules of the invention canbe used to express adenylate kinase protein (e.g., via a recombinantexpression vector in a host cell in gene therapy applications), todetect adenylate kinase mRNA (e.g., in a biological sample) or a geneticlesion in an adenylate kinase gene, and to modulate adenylate kinaseactivity. In addition, the adenylate kinase proteins can be used toscreen drugs or compounds that modulate the immune response as well asto treat disorders characterized by insufficient or excessive productionof adenylate kinase protein or production of adenylate kinase proteinforms that have decreased or aberrant activity compared to adenylatekinase wild type protein. In addition, the anti-adenylate kinaseantibodies of the invention can be used to detect and isolate adenylatekinase proteins and modulate adenylate kinase activity.

A. Screening Assays

The invention provides a method (also referred to herein as a “screeningassay”) for identifying modulators, i.e., candidate or test compounds oragents (e.g., peptides, peptidomimetics, small molecules, or otherdrugs) that bind to adenylate kinase proteins or have a stimulatory orinhibitory effect on, for example, adenylate kinase expression oradenylate kinase activity.

The test compounds of the present invention can be obtained using any ofthe numerous approaches in combinatorial library methods known in theart, including biological libraries, spatially addressable parallelsolid phase or solution phase libraries, synthetic library methodsrequiring deconvolution, the “one-bead one-compound” library method, andsynthetic library methods using affinity chromatography selection. Thebiological library approach is limited to peptide libraries, while theother four approaches are applicable to peptide, nonpeptide oligomer, orsmall molecule libraries of compounds (Lam (1997) Anticancer Drug Des.12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in: DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and Gallop et al. (1994) J. Med. Chem. 37:1233.

Libraries of compounds may be presented in solution (e.g., Houghten(1992) Bio/Techniques 13:412-421), or on beads (Lam (1991) Nature354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (U.S. Pat.No. 5,223,409), spores (U.S. Pat. Nos. 5,571,698; 5,403,484; and5,223,409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA89:1865-1869), or phage (Scott and Smith (1990) Science 249:386-390;Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl.Acad. Sci. USA 87:6378-6382; and Felici (1991) J. Mol. Biol.222:301-310).

Determining the ability of the test compound to bind to the adenylate kinase protein can be accomplished, for example, by coupling the test compound with a radioisotope or enzymatic label such that binding of the test compound to the adenylate kinase protein or biologically active portion thereof can be determined by detecting the labeled compound in a complex. For example, test compounds can be labeled with ¹²⁵I, ³⁵S, ¹⁴C, or ³H, either directly or indirectly, and the radioisotope detected by direct counting of radioemission or by scintillation counting. Alternatively, test compounds can be enzymatically labeled with, for example, horseradish peroxidase, alkaline phosphatase, or luciferase, and the enzymatic label detected by determination of conversion of an appropriate substrate to product.

In a similar manner, one may determine the ability of the adenylatekinase protein to bind to or interact with an adenylate kinase targetmolecule. By “target molecule” is intended a molecule with which anadenylate kinase protein binds or interacts in nature. In a preferredembodiment, the ability of the adenylate kinase protein to bind to orinteract with an adenylate kinase target molecule can be determined bymonitoring the activity of the target molecule. For example, theactivity of the target molecule can be monitored by detecting inductionof a cellular second messenger of the target (e.g., intracellular Ca²⁺,diacylglycerol, IP3, etc.), detecting catalytic/enzymatic activity ofthe target on an appropriate substrate, detecting the induction of areporter gene (e.g., an adenylate kinase-responsive regulatory elementoperably linked to a nucleic acid encoding a detectable marker, e.g.luciferase), or detecting a cellular response, for example, cellulardifferentiation or cell proliferation.

In yet another embodiment, an assay of the present invention is acell-free assay comprising contacting an adenylate kinase protein orbiologically active portion thereof with a test compound and determiningthe ability of the test compound to bind to the adenylate kinase proteinor biologically active portion thereof. Binding of the test compound tothe adenylate kinase protein can be determined either directly orindirectly as described above. In a preferred embodiment, the assayincludes contacting the adenylate kinase protein or biologically activeportion thereof with a known compound that binds adenylate kinaseprotein to form an assay mixture, contacting the assay mixture with atest compound, and determining the ability of the test compound topreferentially bind to adenylate kinase protein or biologically activeportion thereof as compared to the known compound.

In another embodiment, an assay is a cell-free assay comprisingcontacting adenylate kinase protein or biologically active portionthereof with a test compound and determining the ability of the testcompound to modulate (e.g., stimulate or inhibit) the activity of theadenylate kinase protein or biologically active portion thereof.Determining the ability of the test compound to modulate the activity ofan adenylate kinase protein can be accomplished, for example, bydetermining the ability of the adenylate kinase protein to bind to anadenylate kinase target molecule as described above for determiningdirect binding. In an alternative embodiment, determining the ability ofthe test compound to modulate the activity of an adenylate kinaseprotein can be accomplished by determining the ability of the adenylatekinase protein to further modulate an adenylate kinase target molecule.For example, the catalytic/enzymatic activity of the target molecule onan appropriate substrate can be determined as previously described.

In yet another embodiment, the cell-free assay comprises contacting theadenylate kinase protein or biologically active portion thereof with aknown compound that binds an adenylate kinase protein to form an assaymixture, contacting the assay mixture with a test compound, anddetermining the ability of the test compound to preferentially bind toor modulate the activity of an adenylate kinase target molecule.

In the above-mentioned assays, it may be desirable to immobilize eitheran adenylate kinase protein or its target molecule to facilitateseparation of complexed from uncomplexed forms of one or both of theproteins, as well as to accommodate automation of the assay. In oneembodiment, a fusion protein can be provided that adds a domain thatallows one or both of the proteins to be bound to a matrix. For example,glutathione-S-transferase/adenylate kinase fusion proteins orglutathione-S-transferase/target fusion proteins can be adsorbed ontoglutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) orglutathione-derivatized microtitre plates, which are then combined withthe test compound or the test compound and either the nonadsorbed targetprotein or adenylate kinase protein, and the mixture incubated underconditions conducive to complex formation (e.g., at physiologicalconditions for salt and pH). Following incubation, the beads ormicrotitre plate wells are washed to remove any unbound components andcomplex formation is measured either directly or indirectly, forexample, as described above. Alternatively, the complexes can bedissociated from the matrix, and the level of adenylate kinase bindingor activity determined using standard techniques.

Other techniques for immobilizing proteins on matrices can also be usedin the screening assays of the invention. For example, either adenylatekinase protein or its target molecule can be immobilized utilizingconjugation of biotin and streptavidin. Biotinylated adenylate kinasemolecules or target molecules can be prepared from biotin-NHS(N-hydroxy-succinimide) using techniques well known in the art (e.g.,biotinylation kit, Pierce Chemicals, Rockford, Ill.), and immobilized inthe wells of streptavidin-coated 96-well plates (Pierce Chemicals).Alternatively, antibodies reactive with an adenylate kinase protein ortarget molecules but which do not interfere with binding of theadenylate kinase protein to its target molecule can be derivatized tothe wells of the plate, and unbound target or adenylate kinase proteintrapped in the wells by antibody conjugation. Methods for detecting suchcomplexes, in addition to those described above for the GST-immobilizedcomplexes, include immunodetection of complexes using antibodiesreactive with the adenylate kinase protein or target molecule, as wellas enzyme-linked assays that rely on detecting an enzymatic activityassociated with the adenylate kinase protein or target molecule.

In another embodiment, modulators of adenylate kinase expression areidentified in a method in which a cell is contacted with a candidatecompound and the expression of adenylate kinase mRNA or protein in thecell is determined relative to expression of adenylate kinase mRNA orprotein in a cell in the absence of the candidate compound. Whenexpression is greater (statistically significantly greater) in thepresence of the candidate compound than in its absence, the candidatecompound is identified as a stimulator of adenylate kinase mRNA orprotein expression. Alternatively, when expression is less(statistically significantly less) in the presence of the candidatecompound than in its absence, the candidate compound is identified as aninhibitor of adenylate kinase mRNA or protein expression. The level ofadenylate kinase mRNA or protein expression in the cells can bedetermined by methods described herein for detecting adenylate kinasemRNA or protein.
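The text does not name a particular statistical test for deciding whether expression is significantly greater or less. As one possible sketch, the example below applies an exact two-sided permutation test to the difference in mean expression between treated and untreated cells. The choice of test, the 0.05 threshold, the function names, and the example measurements are all assumptions made for illustration.

```python
# Illustrative sketch: classify a candidate compound as a stimulator or inhibitor
# of expression using an exact two-sided permutation test (an assumed choice of test).
from itertools import combinations
from statistics import mean


def permutation_test(treated, control):
    """Return (difference in means, two-sided exact permutation p-value)."""
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_treated, n_total = len(treated), len(pooled)
    total = sum(pooled)
    extreme = count = 0
    for indices in combinations(range(n_total), n_treated):
        group_sum = sum(pooled[i] for i in indices)
        diff = group_sum / n_treated - (total - group_sum) / (n_total - n_treated)
        count += 1
        if abs(diff) >= abs(observed) - 1e-12:  # tolerance for float rounding
            extreme += 1
    return observed, extreme / count


def classify_compound(treated, control, alpha=0.05):
    """Label the compound from expression levels with and without the compound."""
    diff, p_value = permutation_test(treated, control)
    if p_value < alpha:
        return "stimulator" if diff > 0 else "inhibitor"
    return "no significant effect"


if __name__ == "__main__":
    with_compound = [10.0, 11.0, 12.0, 13.0]  # hypothetical expression levels
    without_compound = [1.0, 2.0, 3.0, 4.0]   # hypothetical expression levels
    print(classify_compound(with_compound, without_compound))
```

An exact permutation test enumerates every regrouping of the measurements, so it is practical only for small numbers of replicates; other tests could equally be used.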

In yet another aspect of the invention, the adenylate kinase proteinscan be used as “bait proteins” in a two-hybrid assay or three-hybridassay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartelet al. (1993) Bio/Techniques 14:920-924; Iwabuchi et al. (1993) Oncogene8:1693-1696; and PCT Publication No. WO 94/10300), to identify otherproteins, which bind to or interact with adenylate kinase protein(“adenylate kinase-binding proteins” or “adenylate kinase-bp”) andmodulate adenylate kinase activity. Such adenylate kinase-bindingproteins are also likely to be involved in the propagation of signals bythe adenylate kinase proteins as, for example, upstream or downstreamelements of the adenylate kinase pathway.

This invention further pertains to novel agents identified by the above-described screening assays and uses thereof for treatments as described herein. Accordingly, the invention is directed to agents that modulate the level or activity of the polypeptide or nucleic acid of the invention, the agents being identified by screening cells, tissues, cell extracts, or tissue extracts with the agents. Agents that alter the level or activity can then be tested further for clinical diagnostic or therapeutic use. Any method of screening that allows expression to be measured, such as those disclosed herein, is relevant to the identification of these agents. Thus, the invention is directed to agents identified by the screening processes involving measuring or detecting expression (level or activity) of the polypeptides or nucleic acids of the invention. It is understood that agents affecting the ability of the protein or nucleic acid to interact with a cellular component, as in competition binding, would be construed as affecting expression. Accordingly, screening processes also include assays for agents that themselves bind to the protein or nucleic acid of the invention, such as those disclosed herein.

B. Detection Assays

Portions or fragments of the cDNA sequences identified herein (and the corresponding complete gene sequences) can be used in numerous ways as polynucleotide reagents. For example, these sequences can be used to: (1) map their respective genes on a chromosome; (2) identify an individual from a minute biological sample (tissue typing); and (3) aid in forensic identification of a biological sample. These applications are described in the subsections below.

1. Chromosome Mapping

The isolated complete or partial adenylate kinase gene sequences of the invention can be used to map their respective adenylate kinase genes on a chromosome, thereby facilitating the location of gene regions associated with genetic disease. Computer analysis of adenylate kinase sequences can be used to rapidly select PCR primers (preferably 15-25 bp in length) that do not span more than one exon in the genomic DNA, thereby simplifying the amplification process. These primers can then be used for PCR screening of somatic cell hybrids containing individual human chromosomes. Only those hybrids containing the human gene corresponding to the adenylate kinase sequences will yield an amplified fragment.
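The primer constraint described above (15-25 bp, contained within a single exon) can be sketched as a simple window enumeration. The example assumes that exon boundaries are already known as 0-based, end-exclusive coordinates; the function name and the toy data are hypothetical, and practical primer design would add further criteria (e.g., melting temperature) that the text does not specify.

```python
# Illustrative sketch: enumerate candidate primer windows of 15-25 bp that lie
# entirely within one exon. Exon coordinates are assumed known (0-based, end-exclusive).


def candidate_primers(genomic, exons, min_len=15, max_len=25):
    """Return (start, end, sequence) for every window fully inside a single exon."""
    primers = []
    for exon_start, exon_end in exons:
        for length in range(min_len, max_len + 1):
            for start in range(exon_start, exon_end - length + 1):
                end = start + length
                primers.append((start, end, genomic[start:end]))
    return primers


if __name__ == "__main__":
    toy_genomic = "ACGT" * 10            # hypothetical 40-base genomic stretch
    toy_exons = [(0, 16), (30, 40)]      # hypothetical exon coordinates
    for primer in candidate_primers(toy_genomic, toy_exons):
        print(primer)
```

Because every window is drawn from within one exon, no candidate can span an exon-exon junction, which is the property the text relies on to simplify amplification from genomic DNA.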

Somatic cell hybrids are prepared by fusing somatic cells from differentmammals (e.g., human and mouse cells). As hybrids of human and mousecells grow and divide, they gradually lose human chromosomes in randomorder, but retain the mouse chromosomes. By using media in which mousecells cannot grow (because they lack a particular enzyme), but in whichhuman cells can, the one human chromosome that contains the geneencoding the needed enzyme will be retained. By using various media,panels of hybrid cell lines can be established. Each cell line in apanel contains either a single human chromosome or a small number ofhuman chromosomes, and a full set of mouse chromosomes, allowing easymapping of individual genes to specific human chromosomes (D'Eustachioet al. (1983) Science 220:919-924). Somatic cell hybrids containing onlyfragments of human chromosomes can also be produced by using humanchromosomes with translocations and deletions.
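Assigning a gene to a chromosome from such a panel amounts to finding the human chromosome whose presence across the hybrid lines matches the PCR results. The sketch below illustrates that concordance check; the panel contents, cell line names, and function name are hypothetical.

```python
# Illustrative sketch: map a gene to a chromosome by concordance between PCR
# results and the human chromosomes retained in each hybrid cell line.


def concordant_chromosomes(panel, pcr_positive):
    """Return chromosomes present in every PCR-positive line and absent from every negative one.

    panel: {cell_line: set of retained human chromosomes}
    pcr_positive: set of cell lines that yielded an amplified fragment
    """
    all_chromosomes = set().union(*panel.values())
    matches = []
    for chromosome in sorted(all_chromosomes):
        if all(
            (chromosome in retained) == (line in pcr_positive)
            for line, retained in panel.items()
        ):
            matches.append(chromosome)
    return matches


if __name__ == "__main__":
    hybrid_panel = {                      # hypothetical panel
        "hybrid_1": {"1", "7"},
        "hybrid_2": {"7", "X"},
        "hybrid_3": {"1", "X"},
    }
    positives = {"hybrid_1", "hybrid_2"}  # hypothetical PCR results
    print(concordant_chromosomes(hybrid_panel, positives))
```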

Other mapping strategies that can similarly be used to map an adenylate kinase sequence to its chromosome include in situ hybridization (described in Fan et al. (1990) Proc. Natl. Acad. Sci. USA 87:6223-27), pre-screening with labeled flow-sorted chromosomes, and pre-selection by hybridization to chromosome specific cDNA libraries. Furthermore, fluorescence in situ hybridization (FISH) of a DNA sequence to a metaphase chromosomal spread can be used to provide a precise chromosomal location in one step. For a review of this technique, see Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, NY). The FISH technique can be used with a DNA sequence as short as 500 or 600 bases. However, clones larger than 1,000 bases have a higher likelihood of binding to a unique chromosomal location with sufficient signal intensity for simple detection. Preferably 1,000 bases, and more preferably 2,000 bases, will suffice to get good results in a reasonable amount of time.

Reagents for chromosome mapping can be used individually to mark asingle chromosome or a single site on that chromosome, or panels ofreagents can be used for marking multiple sites and/or multiplechromosomes. Reagents corresponding to noncoding regions of the genesactually are preferred for mapping purposes. Coding sequences are morelikely to be conserved within gene families, thus increasing the chanceof cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, thephysical position of the sequence on the chromosome can be correlatedwith genetic map data. (Such data are found, for example, in V.McKusick, Mendelian Inheritance in Man, available on-line through JohnsHopkins University Welch Medical Library). The relationship betweengenes and disease, mapped to the same chromosomal region, can then beidentified through linkage analysis (co-inheritance of physicallyadjacent genes), described in, e.g., Egeland et al. (1987) Nature325:783-787.

Moreover, differences in the DNA sequences between individuals affectedand unaffected with a disease associated with the adenylate kinase genecan be determined. If a mutation is observed in some or all of theaffected individuals but not in any unaffected individuals, then themutation is likely to be the causative agent of the particular disease.Comparison of affected and unaffected individuals generally involvesfirst looking for structural alterations in the chromosomes such asdeletions or translocations that are visible from chromosome spreads ordetectable using PCR based on that DNA sequence. Ultimately, completesequencing of genes from several individuals can be performed to confirmthe presence of a mutation and to distinguish mutations frompolymorphisms.

2. Tissue Typing

The adenylate kinase sequences of the present invention can also be usedto identify individuals from minute biological samples. The UnitedStates military, for example, is considering the use of restrictionfragment length polymorphism (RFLP) for identification of its personnel.In this technique, an individual's genomic DNA is digested with one ormore restriction enzymes and probed on a Southern blot to yield uniquebands for identification. The sequences of the present invention areuseful as additional DNA markers for RFLP (described in U.S. Pat. No.5,272,057).

Furthermore, the sequences of the present invention can be used to provide an alternative technique for determining the actual base-by-base DNA sequence of selected portions of an individual's genome. Thus, the adenylate kinase sequences of the invention can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify an individual's DNA and subsequently sequence it.

Panels of corresponding DNA sequences from individuals, prepared in this manner, can provide unique individual identifications, as each individual will have a unique set of such DNA sequences due to allelic differences. The adenylate kinase sequences of the invention uniquely represent portions of the human genome. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. It is estimated that allelic variation between individual humans occurs with a frequency of about once per each 500 bases. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes. The noncoding sequences of SEQ ID NO:1 can comfortably provide positive individual identification with a panel of perhaps 10 to 1,000 primers that each yield a noncoding amplified sequence of 100 bases. If a predicted coding sequence, such as that in SEQ ID NO:1, is used, a more appropriate number of primers for positive individual identification would be 500 to 2,000.
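The figures above imply a rough expected number of variable sites surveyed by a primer panel. The sketch below applies the stated average of about one allelic difference per 500 bases uniformly to 100-base amplified sequences; treating the average as uniform is a simplifying assumption, since the text notes that noncoding regions vary more and coding regions less. The function name is hypothetical.

```python
# Illustrative sketch: expected number of allelic differences surveyed by a panel,
# applying the stated average of about one difference per 500 bases uniformly.


def expected_variant_sites(n_primers, amplicon_len=100, bases_per_variant=500):
    """Expected count of allelic differences across all amplified sequences."""
    if n_primers < 0 or amplicon_len <= 0 or bases_per_variant <= 0:
        raise ValueError("arguments must be non-negative and lengths positive")
    return n_primers * amplicon_len / bases_per_variant


if __name__ == "__main__":
    for panel_size in (10, 1000):
        print(panel_size, expected_variant_sites(panel_size))
```

On this rough estimate a 10-primer panel surveys about 2 expected variable sites and a 1,000-primer panel about 200, which illustrates why larger panels give more discriminating identification.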

3. Use of Partial Adenylate Kinase Sequences in Forensic Biology

DNA-based identification techniques can also be used in forensic biology. In this manner, PCR technology can be used to amplify DNA sequences taken from very small biological samples such as tissues, e.g., hair or skin, or body fluids, e.g., blood, saliva, or semen found at a crime scene. The amplified sequence can then be compared to a standard, thereby allowing identification of the origin of the biological sample.

The sequences of the present invention can be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” that is unique to a particular individual. As mentioned above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to noncoding regions of SEQ ID NO:1 are particularly appropriate for this use as greater numbers of polymorphisms occur in the noncoding regions, making it easier to differentiate individuals using this technique. Examples of polynucleotide reagents include the adenylate kinase sequences or portions thereof, e.g., fragments derived from the noncoding regions of SEQ ID NO:1 having a length of at least 20 or 30 bases.

The adenylate kinase sequences described herein can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes that can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This can be very useful in cases where a forensic pathologist is presented with a tissue of unknown origin. Panels of such adenylate kinase probes can be used to identify tissue by species and/or by organ type.

In a similar fashion, these reagents, e.g., adenylate kinase primers or probes, can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

C. Predictive Medicine

The present invention also pertains to the field of predictive medicine in which diagnostic assays, prognostic assays, pharmacogenomics, and monitoring clinical trials are used for prognostic (predictive) purposes to thereby treat an individual prophylactically. These applications are described in the subsections below.

1. Diagnostic Assays

One aspect of the present invention relates to diagnostic assays for detecting adenylate kinase protein and/or nucleic acid expression as well as adenylate kinase activity, in the context of a biological sample. An exemplary method for detecting the presence or absence of adenylate kinase proteins in a biological sample involves obtaining a biological sample from a test subject and contacting the biological sample with a compound or an agent capable of detecting adenylate kinase protein or nucleic acid (e.g., mRNA, genomic DNA) that encodes adenylate kinase protein such that the presence of adenylate kinase protein is detected in the biological sample. Results obtained with a biological sample from the test subject may be compared to results obtained with a biological sample from a control subject.

A preferred agent for detecting adenylate kinase mRNA or genomic DNA is a labeled nucleic acid probe capable of hybridizing to adenylate kinase mRNA or genomic DNA. The nucleic acid probe can be, for example, a full-length adenylate kinase nucleic acid, such as the nucleic acid of SEQ ID NO:1, or a portion thereof, such as a nucleic acid molecule of at least 15, 30, 50, 100, 250, or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to adenylate kinase mRNA or genomic DNA. Other suitable probes for use in the diagnostic assays of the invention are described herein.

A preferred agent for detecting adenylate kinase protein is an antibody capable of binding to adenylate kinase protein, preferably an antibody with a detectable label. Antibodies can be polyclonal, or more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used. The term “labeled”, with regard to the probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to the probe or antibody, as well as indirect labeling of the probe or antibody by reactivity with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin.

The term “biological sample” is intended to include tissues, cells, and biological fluids isolated from a subject, as well as tissues, cells, and fluids present within a subject. That is, the detection method of the invention can be used to detect adenylate kinase mRNA, protein, or genomic DNA in a biological sample in vitro as well as in vivo. For example, in vitro techniques for detection of adenylate kinase mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detection of adenylate kinase protein include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations, and immunofluorescence. In vitro techniques for detection of adenylate kinase genomic DNA include Southern hybridizations. Furthermore, in vivo techniques for detection of adenylate kinase protein include introducing into a subject a labeled anti-adenylate kinase antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques.

In one embodiment, the biological sample contains protein molecules from the test subject. Alternatively, the biological sample can contain mRNA molecules from the test subject or genomic DNA molecules from the test subject. A preferred biological sample is a peripheral blood leukocyte sample isolated by conventional means from a subject.

The invention also encompasses kits for detecting the presence of adenylate kinase proteins in a biological sample (a test sample). Such kits can be used to determine if a subject is suffering from or is at increased risk of developing a disorder associated with aberrant expression of adenylate kinase protein (e.g., an immunological disorder). For example, the kit can comprise a labeled compound or agent capable of detecting adenylate kinase protein or mRNA in a biological sample and means for determining the amount of an adenylate kinase protein in the sample (e.g., an anti-adenylate kinase antibody or an oligonucleotide probe that binds to DNA encoding an adenylate kinase protein, e.g., SEQ ID NO:1). Kits can also include instructions for observing that the tested subject is suffering from or is at risk of developing a disorder associated with aberrant expression of adenylate kinase sequences if the amount of adenylate kinase protein or mRNA is above or below a normal level.

For antibody-based kits, the kit can comprise, for example: (1) a first antibody (e.g., attached to a solid support) that binds to adenylate kinase protein; and, optionally, (2) a second, different antibody that binds to adenylate kinase protein or the first antibody and is conjugated to a detectable agent. For oligonucleotide-based kits, the kit can comprise, for example: (1) an oligonucleotide, e.g., a detectably labeled oligonucleotide, that hybridizes to an adenylate kinase nucleic acid sequence or (2) a pair of primers useful for amplifying an adenylate kinase nucleic acid molecule.

The kit can also comprise, e.g., a buffering agent, a preservative, or a protein stabilizing agent. The kit can also comprise components necessary for detecting the detectable agent (e.g., an enzyme or a substrate). The kit can also contain a control sample or a series of control samples that can be assayed and compared to the test sample. Each component of the kit is usually enclosed within an individual container, and all of the various containers are within a single package along with instructions for observing whether the tested subject is suffering from or is at risk of developing a disorder associated with aberrant expression of adenylate kinase proteins.

2. Prognostic Assays

The methods described herein can furthermore be utilized as diagnostic or prognostic assays to identify subjects having or at risk of developing a disease or disorder associated with adenylate kinase protein, adenylate kinase nucleic acid expression, or adenylate kinase activity. Prognostic assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with adenylate kinase protein, adenylate kinase nucleic acid expression, or adenylate kinase activity.

Thus, the present invention provides a method in which a test sample is obtained from a subject, and adenylate kinase protein or nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of adenylate kinase protein or nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant adenylate kinase expression or activity. As used herein, a “test sample” refers to a biological sample obtained from a subject of interest. For example, a test sample can be a biological fluid (e.g., serum), cell sample, or tissue.

Furthermore, using the prognostic assays described herein, the present invention provides methods for determining whether a subject can be administered a specific agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate) or class of agents (e.g., agents of a type that decrease adenylate kinase activity) to effectively treat a disease or disorder associated with aberrant adenylate kinase expression or activity. In this manner, a test sample is obtained and adenylate kinase protein or nucleic acid is detected. The presence of adenylate kinase protein or nucleic acid is diagnostic for a subject that can be administered the agent to treat a disorder associated with aberrant adenylate kinase expression or activity.

The methods of the invention can also be used to detect genetic lesions or mutations in an adenylate kinase gene, thereby determining if a subject with the lesioned gene is at risk for a disorder characterized by aberrant cell proliferation and/or differentiation. In preferred embodiments, the methods include detecting, in a sample of cells from the subject, the presence or absence of a genetic lesion or mutation characterized by at least one of an alteration affecting the integrity of a gene encoding an adenylate kinase protein, or the misexpression of the adenylate kinase gene. For example, such genetic lesions or mutations can be detected by ascertaining the existence of at least one of: (1) a deletion of one or more nucleotides from an adenylate kinase gene; (2) an addition of one or more nucleotides to an adenylate kinase gene; (3) a substitution of one or more nucleotides of an adenylate kinase gene; (4) a chromosomal rearrangement of an adenylate kinase gene; (5) an alteration in the level of a messenger RNA transcript of an adenylate kinase gene; (6) an aberrant modification of an adenylate kinase gene, such as of the methylation pattern of the genomic DNA; (7) the presence of a non-wild-type splicing pattern of a messenger RNA transcript of an adenylate kinase gene; (8) a non-wild-type level of an adenylate kinase protein; (9) an allelic loss of an adenylate kinase gene; and (10) an inappropriate post-translational modification of an adenylate kinase protein. As described herein, there are a large number of assay techniques known in the art that can be used for detecting lesions in an adenylate kinase gene. Any cell type or tissue, preferably peripheral blood leukocytes, in which adenylate kinase proteins are expressed may be utilized in the prognostic assays described herein.

In certain embodiments, detection of the lesion involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) Proc. Natl. Acad. Sci. USA 91:360-364), the latter of which can be particularly useful for detecting point mutations in the adenylate kinase gene (see, e.g., Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.

In an alternative embodiment, mutations in an adenylate kinase gene from a sample cell can be identified by alterations in restriction enzyme cleavage patterns of isolated test sample and control DNA digested with one or more restriction endonucleases. Moreover, sequence-specific ribozymes (see, e.g., U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.
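The comparison of restriction cleavage patterns described above can be sketched in a few lines. The example below is illustrative only: the two short sequences are invented, the `digest` helper is hypothetical, and it models a single enzyme cutting a linear molecule at a fixed offset within each recognition site (here the EcoRI site GAATTC, cut after the first G). It ignores partial digestion, methylation sensitivity, and gel resolution.

```python
def digest(seq, site, cut_offset):
    """Return sorted fragment lengths after cutting a linear sequence.

    The sequence is cut `cut_offset` bases into every occurrence of the
    recognition `site`. This is a simplified, hypothetical model.
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(cut - start)
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(len(seq) - start)  # final fragment up to the end
    return sorted(fragments)


# Invented control sequence with two EcoRI sites.
control = "AAAA" + "GAATTC" + "TTTTTT" + "GAATTC" + "CC"
# Invented test sequence in which a point mutation destroys the second site.
sample = "AAAA" + "GAATTC" + "TTTTTT" + "GAGTTC" + "CC"

control_pattern = digest(control, "GAATTC", 1)
sample_pattern = digest(sample, "GAATTC", 1)

if __name__ == "__main__":
    print("control fragments:", control_pattern)
    print("sample fragments: ", sample_pattern)
    print("patterns differ:  ", control_pattern != sample_pattern)
```

A difference between the two fragment-length lists corresponds to the altered cleavage pattern that would flag a mutation in the test sample.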

In other embodiments, genetic mutations in an adenylate kinase molecule can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al. (1996) Human Mutation 7:244-255; Kozal et al. (1996) Nature Medicine 2:753-759). In yet another embodiment, any of a variety of sequencing reactions known in the art can be used to directly sequence the adenylate kinase gene and detect mutations by comparing the sequence of the sample adenylate kinase gene with the corresponding wild-type (control) sequence. Examples of sequencing reactions include those based on techniques developed by Maxam and Gilbert ((1977) Proc. Natl. Acad. Sci. USA 74:560) or Sanger ((1977) Proc. Natl. Acad. Sci. USA 74:5463). It is also contemplated that any of a variety of automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Bio/Techniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT Publication No. WO 94/16101; Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-159).

Other methods for detecting mutations in the adenylate kinase gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA heteroduplexes (Myers et al. (1985) Science 230:1242). See also Cotton et al. (1988) Proc. Natl. Acad. Sci. USA 85:4397; Saleeba et al. (1992) Methods Enzymol. 217:286-295. In a preferred embodiment, the control DNA or RNA can be labeled for detection.

In still another embodiment, the mismatch cleavage reaction employs one or more “DNA mismatch repair” enzymes that recognize mismatched base pairs in double-stranded DNA in defined systems for detecting and mapping point mutations in adenylate kinase cDNAs obtained from samples of cells. See, e.g., Hsu et al. (1994) Carcinogenesis 15:1657-1662. According to an exemplary embodiment, a probe based on an adenylate kinase sequence, e.g., a wild-type adenylate kinase sequence, is hybridized to a cDNA or other DNA product from a test cell(s). The duplex is treated with a DNA mismatch repair enzyme, and the cleavage products, if any, can be detected from electrophoresis protocols or the like. See, e.g., U.S. Pat. No. 5,459,039.

In other embodiments, alterations in electrophoretic mobility will be used to identify mutations in adenylate kinase genes. For example, single-strand conformation polymorphism (SSCP) may be used to detect differences in electrophoretic mobility between mutant and wild-type nucleic acids (Orita et al. (1989) Proc. Natl. Acad. Sci. USA 86:2766; see also Cotton (1993) Mutat. Res. 285:125-144; Hayashi (1992) Genet. Anal. Tech. Appl. 9:73-79). The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In a preferred embodiment, the subject method utilizes heteroduplex analysis to separate double-stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet. 7:5).

In yet another embodiment, the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (DGGE) (Myers et al. (1985) Nature 313:495). When DGGE is used as the method of analysis, DNA will be modified to ensure that it does not completely denature, for example by adding a GC clamp of approximately 40 bp of high-melting GC-rich DNA by PCR. In a further embodiment, a temperature gradient is used in place of a denaturing gradient to identify differences in the mobility of control and sample DNA (Rosenbaum and Reissner (1987) Biophys. Chem. 265:12753).

Examples of other techniques for detecting point mutations include, but are not limited to, selective oligonucleotide hybridization, selective amplification, or selective primer extension. For example, oligonucleotide primers may be prepared in which the known mutation is placed centrally and then hybridized to target DNA under conditions that permit hybridization only if a perfect match is found (Saiki et al. (1986) Nature 324:163; Saiki et al. (1989) Proc. Natl. Acad. Sci. USA 86:6230). Such allele-specific oligonucleotides are hybridized to PCR-amplified target DNA or a number of different mutations when the oligonucleotides are attached to the hybridizing membrane and hybridized with labeled target DNA.

Alternatively, allele-specific amplification technology, which depends on selective PCR amplification, may be used in conjunction with the instant invention. Oligonucleotides used as primers for specific amplification may carry the mutation of interest in the center of the molecule so that amplification depends on differential hybridization (Gibbs et al. (1989) Nucleic Acids Res. 17:2437-2448) or at the extreme 3′ end of one primer where, under appropriate conditions, mismatch can prevent or reduce polymerase extension (Prosser (1993) Tibtech 11:238). In addition, it may be desirable to introduce a novel restriction site in the region of the mutation to create cleavage-based detection (Gasparini et al. (1992) Mol. Cell. Probes 6:1). It is anticipated that in certain embodiments amplification may also be performed using Taq ligase (Barany (1991) Proc. Natl. Acad. Sci. USA 88:189). In such cases, ligation will occur only if there is a perfect match at the 3′ end of the 5′ sequence, making it possible to detect the presence of a known mutation at a specific site by looking for the presence or absence of amplification.
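The placement of the mutation at the extreme 3′ end of an allele-specific primer can be illustrated with a short sketch. The template below is an invented repeat sequence, not a sequence of the invention, and the `allele_specific_primer` helper and its default length are hypothetical. The sketch builds forward-strand primers only and does not consider melting temperature, secondary structure, or the reverse primer.

```python
def allele_specific_primer(template, variant_index, allele, length=20):
    """Build a forward primer whose 3'-terminal base is the chosen allele.

    `variant_index` is the 0-based position of the variant in `template`.
    The primer copies the `length - 1` template bases upstream of that
    position and ends with `allele`.
    """
    if allele not in ("A", "C", "G", "T"):
        raise ValueError("allele must be one of A, C, G, T")
    if not 0 <= variant_index < len(template):
        raise ValueError("variant_index is outside the template")
    start = variant_index - length + 1
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    return template[start:variant_index] + allele


# Invented 25-base template; position 21 carries a wild-type C.
template = "ACGT" * 6 + "A"
wild_type_primer = allele_specific_primer(template, 21, template[21], length=10)
mutant_primer = allele_specific_primer(template, 21, "T", length=10)

if __name__ == "__main__":
    print("wild-type primer:", wild_type_primer)
    print("mutant primer:   ", mutant_primer)
```

The two primers differ only in their last base, so under appropriately stringent conditions each would be expected to extend efficiently only on the matching allele.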

The methods described herein may be performed, for example, by utilizing prepackaged diagnostic kits comprising at least one probe nucleic acid or antibody reagent described herein, which may be conveniently used, e.g., in clinical settings to diagnose patients exhibiting symptoms or family history of a disease or illness involving an adenylate kinase gene.

3. Pharmacogenomics

Agents, or modulators, that have a stimulatory or inhibitory effect on adenylate kinase activity (e.g., adenylate kinase gene expression) as identified by a screening assay described herein, can be administered to individuals to treat (prophylactically or therapeutically) disorders associated with aberrant adenylate kinase activity as well as to modulate the phenotype of an immune response. In conjunction with such treatment, the pharmacogenomics (i.e., the study of the relationship between an individual's genotype and that individual's response to a foreign compound or drug) of the individual may be considered. Differences in metabolism of therapeutics can lead to severe toxicity or therapeutic failure by altering the relation between dose and blood concentration of the pharmacologically active drug. Thus, the pharmacogenomics of the individual permits the selection of effective agents (e.g., drugs) for prophylactic or therapeutic treatments based on a consideration of the individual's genotype. Such pharmacogenomics can further be used to determine appropriate dosages and therapeutic regimens. Accordingly, the activity of adenylate kinase protein, expression of adenylate kinase nucleic acid, or mutation content of adenylate kinase genes in an individual can be determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment of the individual.

Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Linder (1997) Clin. Chem. 43(2):254-266. In general, two types of pharmacogenetic conditions can be differentiated. Genetic conditions transmitted as a single factor altering the way drugs act on the body are referred to as “altered drug action.” Genetic conditions transmitted as single factors altering the way the body acts on drugs are referred to as “altered drug metabolism.” These pharmacogenetic conditions can occur either as rare defects or as polymorphisms. For example, glucose-6-phosphate dehydrogenase deficiency (G6PD) is a common inherited enzymopathy in which the main clinical complication is haemolysis after ingestion of oxidant drugs (antimalarials, sulfonamides, analgesics, nitrofurans) and consumption of fava beans.

As an illustrative embodiment, the activity of drug metabolizing enzymes is a major determinant of both the intensity and duration of drug action. The discovery of genetic polymorphisms of drug metabolizing enzymes (e.g., N-acetyltransferase 2 (NAT 2) and cytochrome P450 enzymes CYP2D6 and CYP2C19) has provided an explanation as to why some patients do not obtain the expected drug effects or show exaggerated drug response and serious toxicity after taking the standard and safe dose of a drug. These polymorphisms are expressed in two phenotypes in the population, the extensive metabolizer (EM) and poor metabolizer (PM). The prevalence of PM is different among different populations. For example, the gene coding for CYP2D6 is highly polymorphic and several mutations have been identified in PM, which all lead to the absence of functional CYP2D6. Poor metabolizers of CYP2D6 and CYP2C19 quite frequently experience exaggerated drug response and side effects when they receive standard doses. If a metabolite is the active therapeutic moiety, a PM will show no therapeutic response, as demonstrated for the analgesic effect of codeine mediated by its CYP2D6-formed metabolite morphine. The other extreme are the so-called ultra-rapid metabolizers who do not respond to standard doses. Recently, the molecular basis of ultra-rapid metabolism has been identified to be due to CYP2D6 gene amplification.

Thus, the activity of adenylate kinase protein, expression of adenylate kinase nucleic acid, or mutation content of adenylate kinase genes in an individual can be determined to thereby select appropriate agent(s) for therapeutic or prophylactic treatment of the individual. In addition, pharmacogenetic studies can be used to apply genotyping of polymorphic alleles encoding drug-metabolizing enzymes to the identification of an individual's drug responsiveness phenotype. This knowledge, when applied to dosing or drug selection, can avoid adverse reactions or therapeutic failure and thus enhance therapeutic or prophylactic efficiency when treating a subject with an adenylate kinase modulator, such as a modulator identified by one of the exemplary screening assays described herein.

4. Monitoring of Effects During Clinical Trials

Monitoring the influence of agents (e.g., drugs, compounds) on the expression or activity of adenylate kinase genes (e.g., the ability to modulate aberrant cell proliferation and/or differentiation) can be applied not only in basic drug screening but also in clinical trials. For example, the effectiveness of an agent, as determined by a screening assay as described herein, to increase or decrease adenylate kinase gene expression, protein levels, or protein activity, can be monitored in clinical trials of subjects exhibiting decreased or increased adenylate kinase gene expression, protein levels, or protein activity. In such clinical trials, adenylate kinase expression or activity and preferably that of other genes that have been implicated in, for example, a cellular proliferation disorder, can be used as a marker of the immune responsiveness of a particular cell.

For example, and not by way of limitation, genes that are modulated in cells by treatment with an agent (e.g., compound, drug, or small molecule) that modulates adenylate kinase activity (e.g., as identified in a screening assay described herein) can be identified. Thus, to study the effect of agents on cellular proliferation disorders, for example, in a clinical trial, cells can be isolated and RNA prepared and analyzed for the levels of expression of adenylate kinase genes and other genes implicated in the disorder. The levels of gene expression (i.e., a gene expression pattern) can be quantified by Northern blot analysis or RT-PCR, as described herein, or alternatively by measuring the amount of protein produced, by one of the methods as described herein, or by measuring the levels of activity of adenylate kinase genes or other genes. In this way, the gene expression pattern can serve as a marker, indicative of the physiological response of the cells to the agent. Accordingly, this response state may be determined before, and at various points during, treatment of the individual with the agent.

In a preferred embodiment, the present invention provides a method for monitoring the effectiveness of treatment of a subject with an agent (e.g., an agonist, antagonist, peptidomimetic, protein, peptide, nucleic acid, small molecule, or other drug candidate identified by the screening assays described herein) comprising the steps of (1) obtaining a preadministration sample from a subject prior to administration of the agent; (2) detecting the level of expression of an adenylate kinase protein, mRNA, or genomic DNA in the preadministration sample; (3) obtaining one or more postadministration samples from the subject; (4) detecting the level of expression or activity of the adenylate kinase protein, mRNA, or genomic DNA in the postadministration samples; (5) comparing the level of expression or activity of the adenylate kinase protein, mRNA, or genomic DNA in the preadministration sample with the adenylate kinase protein, mRNA, or genomic DNA in the postadministration sample or samples; and (6) altering the administration of the agent to the subject accordingly to bring about the desired effect, i.e., for example, an increase or a decrease in the expression or activity of an adenylate kinase protein.
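The comparison in step (5) above can be sketched as a simple relative-change calculation. This is a minimal illustration under stated assumptions, not the method of the invention: the function names, the numeric levels, and the 10% `tolerance` cutoff are all hypothetical, and a real study would use replicate measurements and an appropriate statistical test.

```python
def response_direction(pre_level, post_level, tolerance=0.10):
    """Classify the change between a pre- and a post-administration level.

    `tolerance` is a hypothetical cutoff on the relative change; it is not
    a value given in the specification.
    """
    if pre_level <= 0:
        raise ValueError("pre_level must be positive")
    relative_change = (post_level - pre_level) / pre_level
    if relative_change > tolerance:
        return "increase"
    if relative_change < -tolerance:
        return "decrease"
    return "unchanged"


def monitor(pre_level, post_levels, desired):
    """For each post-administration sample, report whether the observed
    direction of change matches the desired effect."""
    return [response_direction(pre_level, post) == desired for post in post_levels]


if __name__ == "__main__":
    # Invented expression levels in arbitrary units.
    print(monitor(100.0, [105.0, 80.0, 60.0], desired="decrease"))
```

A sample that does not show the desired direction of change would be the cue, in step (6), to reconsider how the agent is being administered.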

D. Methods of Treatment

The present invention provides for both prophylactic and therapeutic methods of treating a subject at risk of (or susceptible to) a disorder or having a disorder associated with aberrant adenylate kinase expression or activity. Additionally, the compositions of the invention find use in the treatment of disorders described herein.

1. Prophylactic Methods

In one aspect, the invention provides a method for preventing in a subject a disease or condition associated with aberrant adenylate kinase expression or activity by administering to the subject an agent that modulates adenylate kinase expression or at least one adenylate kinase gene activity. Subjects at risk for a disease that is caused, or contributed to, by aberrant adenylate kinase expression or activity can be identified by, for example, any or a combination of diagnostic or prognostic assays as described herein. Administration of a prophylactic agent can occur prior to the manifestation of symptoms characteristic of the adenylate kinase aberrancy, such that a disease or disorder is prevented or, alternatively, delayed in its progression. Depending on the type of adenylate kinase aberrancy, for example, an adenylate kinase agonist or adenylate kinase antagonist agent can be used for treating the subject. The appropriate agent can be determined based on screening assays described herein.

2. Therapeutic Methods

Another aspect of the invention pertains to methods of modulating adenylate kinase expression or activity for therapeutic purposes. The modulatory method of the invention involves contacting a cell with an agent that modulates one or more of the activities of adenylate kinase protein associated with the cell. An agent that modulates adenylate kinase protein activity can be an agent as described herein, such as a nucleic acid or a protein, a naturally-occurring cognate ligand of an adenylate kinase protein, a peptide, an adenylate kinase peptidomimetic, or other small molecule. In one embodiment, the agent stimulates one or more of the biological activities of adenylate kinase protein. Examples of such stimulatory agents include active adenylate kinase protein and a nucleic acid molecule encoding an adenylate kinase protein that has been introduced into the cell. In another embodiment, the agent inhibits one or more of the biological activities of adenylate kinase protein. Examples of such inhibitory agents include antisense adenylate kinase nucleic acid molecules and anti-adenylate kinase antibodies.

These modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). As such, the present invention provides methods of treating an individual afflicted with a disease or disorder characterized by aberrant expression or activity of an adenylate kinase protein or nucleic acid molecule. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or a combination of agents, that modulates (e.g., upregulates or downregulates) adenylate kinase expression or activity. In another embodiment, the method involves administering an adenylate kinase protein or nucleic acid molecule as therapy to compensate for reduced or aberrant adenylate kinase expression or activity.

Stimulation of adenylate kinase activity is desirable in situations in which an adenylate kinase protein is abnormally downregulated and/or in which increased adenylate kinase activity is likely to have a beneficial effect. Conversely, inhibition of adenylate kinase activity is desirable in situations in which adenylate kinase activity is abnormally upregulated and/or in which decreased adenylate kinase activity is likely to have a beneficial effect.

This invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will fully convey the invention to those skilled in the art. Many modifications and other embodiments of the invention will come to the mind of one skilled in the art to which this invention pertains having the benefit of the teachings presented in the foregoing description. Although specific terms are employed, they are used as in the art unless otherwise indicated.

All publications and patent applications mentioned in the specification are indicative of the level of those skilled in the art to which this invention pertains. All publications and patent applications are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of illustration and example for purposes of clarity of understanding, it will be obvious that certain changes and modifications may be practiced within the scope of the appended claims.

CHAPTER 2 21612, 21615, 21620, 21676, 33756, Novel Human Alcohol Dehydrogenases

BACKGROUND OF THE INVENTION

Alcohol dehydrogenases are ubiquitous enzymes that are generally classified as members of either the MDR (medium-chain dehydrogenase/reductase) or SDR (short-chain dehydrogenase/reductase) protein families. Members of the SDR and MDR families appear to have similar activities though they work via different mechanisms and structures. The SDR superfamily comprises isomerases, lyases and oxidoreductases. The enzymes of this family cover a wide range of substrate specificities including steroids, alcohols, and aromatic compounds; however, most family members are known to be NAD⁺- or NADP⁺-dependent oxidoreductases. In the combined SDR superfamily, only a single tyrosine residue is strictly conserved and ascribed a critical enzymatic function. Members of the MDR superfamily are often multimeric enzymes associated with 0, 1, or 2 zinc atoms. Substrates of the MDR enzymes are often alcohols and aldehydes. Six different classes of mammalian ADH isoforms are members of the MDR family. In addition to the MDR and SDR families, alcohol dehydrogenases have also been associated with protein families reflecting iron-dependent enzymes, long-chain enzymes, and several types of prokaryotic enzymes with other cofactor requirements.

Most dehydrogenase proteins function as dimers or tetramers and possess at least two domains: the first domain comprising the coenzyme binding site, and the second domain comprising the substrate binding site. This latter domain determines the substrate specificity and contains the amino acids involved in catalysis. ADHs have a variety of substrate specificities, but act primarily on primary or secondary alcohols, hemiacetals, cyclic secondary alcohols, or on the corresponding aldehydes and ketones. The catalytic role of ADH in mammalian ethanol oxidation is well studied. ADH catalyzes the conversion of ethanol to acetaldehyde using NAD⁺ as a cofactor. Specifically, the coenzyme binds ADH, followed by an interaction with ethanol; the ethanol is subsequently converted to acetaldehyde and the NAD⁺ is converted to NADH. Members of the mammalian ADH protein family have varying electrophoretic mobilities, Michaelis constants (binding affinities) for ethanol, and sensitivities to pyrazole inhibition. For instance, class I ADHs have low K_(m) values (less than 5 mM) for ethanol oxidation while class II and class IV ADHs have intermediate K_(m) values (about 30 mM). Class III ADH enzymes are not saturable with ethanol and function virtually exclusively as glutathione-dependent formaldehyde dehydrogenases. Allelic variants of the mammalian genes have been identified. The kinetic properties of the resultant variants differ significantly owing to single amino acid substitutions in the coenzyme binding domains of the enzymes.

Alcohol dehydrogenases play fundamental roles in degradative, synthetic, and detoxification pathways and have been implicated in a variety of critical developmental processes and pathophysiological disease states. For instance, allelic variations of ADH2 and ADH3 appear to influence the susceptibility to alcoholism and alcoholic liver cirrhosis in Asians (Thomasson et al. (1991) Am. J. Hum. Genet. 48:677-681, Chao et al. (1994) Hepatology 19:360-366, and Higuchi et al. (1995) Am. J. Psychiatry 152:1219-1221). Furthermore, first-pass metabolism is the difference between the quantity of ethanol that reaches the systemic circulation by the intravenous route and the quantity that reaches it after an oral dose. Several lines of evidence now indicate that first-pass metabolism of alcohol in humans may occur in the liver via the activity of members of the mammalian ADH family (Yin et al. (1999) Enzymology and Molecular Biology of Carbonyl Metabolism 7, Plenum Publishers, New York).

ADHs are also involved in detoxification pathways. For instance, class III ADH is unsaturable by ethanol and mainly functions as a glutathione-dependent formaldehyde dehydrogenase and is therefore important for the elimination of endogenously formed formaldehyde. ADHs are also involved in the metabolism of nitrobenzaldehyde, a dietary carcinogen. It has been suggested that the lack of σ-ADH in Japanese patients may lead to a decreased detoxification of the dietary carcinogen nitrobenzaldehyde and may possibly be linked to the high rate of gastric cancer in Japanese (Baron et al. (1991) Life Sci 49:1929-34; Grab et al. (1977) Cancer Res 37:4181-90 and Seedcake et al. (1980) Rev Ed 9:346-51). ADH is also involved in the activation of 1,2-dimethylhydrazine, an experimentally used procarcinogen.

Retinoic acid is a ligand controlling a nuclear receptor signaling pathway that plays a key role in the regulation of embryonic development, spermatogenesis, and epithelial differentiation (Chambon et al. (1996) FASEB J. 10:940-954 and Mangelsdorf et al. (1995) Cell 83:841-850). The synthesis of retinoic acid occurs via the oxidation of retinol to retinal followed by the conversion of retinal to retinoic acid. Members of the alcohol dehydrogenase and short-chain dehydrogenase/reductase families catalyze the reversible, rate-limiting conversion of retinol to retinal, while the oxidation of retinal to retinoic acid is catalyzed by members of the aldehyde dehydrogenase or P450 enzyme families (Duester et al. (1996) Biochemistry 35:12221-12227). Therefore, members of the ADH family influence the growth and developmental processes mediated by the active metabolite retinoic acid.

ADH metabolism of retinol to retinal is inhibited by ethanol, and this may lead to altered epithelial cell differentiation and malignant cell transformation. Furthermore, it has been suggested that the ability of ethanol to inhibit the oxidation of retinol by ADH underlies the pathology of fetal alcohol syndrome, a birth defect characterized by craniofacial, limb, and brain malformations (Duester et al. (1991) Alcohol Clin Exp Res 15:568-572). Retinoic acid also functions to maintain differentiation of epithelial cells and influences spermatogenesis in adult vertebrates (Chambon et al. (1996) FASEB J. 10:940-954). Data suggest that retinoic acid signaling in spermatogenesis and keratinocyte differentiation may be significantly disrupted by ethanol through ADH pathways. It has been proposed that inhibition of retinol metabolism by ethanol may be responsible for the testicular atrophy and impaired spermatogenesis commonly seen in male chronic alcoholics. Furthermore, skin diseases such as psoriasis have been associated with heavy drinking.

ADH may also play a role in colorectal cancers. During colorectal carcinogenesis, ADH activity is significantly decreased in polyps and further decreased in cancer tissue (Egerer et al. (1997) Gastroenterology 112:A1260). Furthermore, epidemiological studies have demonstrated that alcohol consumption is a risk factor for development of oral, esophageal, colorectal, and upper gastrointestinal cancers (Blot et al. (1992) Cancer Res 52:2119s-2123s). The role of ADH in cancers of these various tissues may result from the production of acetaldehyde following oxidation of ethanol by ADH, an alteration in retinol metabolism, or the role of ADH in carcinogen metabolism.

Further functional links between disease and the oxidative/reductive actions of various dehydrogenases are being established. For instance, ERAB is a member of the short-chain dehydrogenase/reductase family. Interactions between amyloid β peptide and ERAB have been shown to mediate neurotoxicity and apoptosis in neuronal cell lines (Yan et al. (1997) Nature 389:689-693) and thus are being implicated in the pathogenesis of neurodegenerative disorders like Alzheimer's disease (Oppermann et al. (1999) Enzymology and Molecular Biology of Carbonyl Metabolism 7, Plenum Publishers, New York and Oppermann et al. (1999) FEBS Letters 451:238-242).

Accordingly, ADHs are a major target for drug action and development. Therefore, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown ADHs. The present invention advances the state of the art by providing previously unidentified human alcohol dehydrogenases.

SUMMARY OF THE INVENTION

It is an object of the invention to identify novel alcohol dehydrogenases.

It is a further object of the invention to provide novel alcohol dehydrogenase polypeptides that are useful as reagents or targets in assays applicable to treatment and diagnosis of ADH-mediated or -related disorders.

It is a further object of the invention to provide polynucleotides corresponding to the novel ADH polypeptides that are useful as targets and reagents in ADH assays applicable to treatment and diagnosis of ADH-mediated or -related disorders and useful for producing novel ADH polypeptides by recombinant methods.

A specific object of the invention is to identify compounds that act as agonists and antagonists and modulate the expression of the novel ADHs.

A further specific object of the invention is to provide compounds that modulate expression of the alcohol dehydrogenases for treatment and diagnosis of ADH-related disorders.

The invention is thus based on the identification of novel human alcohol dehydrogenases. The amino acid sequences for ADH 21620, 33756, 21676, 21612, and 21615 are shown in SEQ ID NOS:5, 7, 9, 11, and 13, respectively. The nucleotide sequences for ADH 21620, 33756, 21676, 21612, and 21615 are shown in SEQ ID NOS:6, 8, 10, 12, and 14, respectively.

The invention provides isolated ADH polypeptides, including a polypeptide having the amino acid sequence shown in SEQ ID NOS:5, 7, 9, 11, and 13, or the amino acid sequence encoded by the cDNA deposited with American Type Culture Collection (ATCC), University Boulevard, Manassas, Va. 20110-2209, as Patent Deposit No. PTA-2012 (corresponding to the 33756 nucleotide sequence) on Jun. 9, 2000; as Patent Deposit No. PTA-2170 (corresponding to the 21612 nucleotide sequence) on Jun. 27, 2000; as Patent Deposit No. PTA-2171 (corresponding to the 21620 nucleotide sequence) on Jun. 27, 2000; as Patent Deposit No. PTA-2812 (corresponding to the 21615 nucleotide sequence) on Dec. 14, 2000; and as Patent Deposit No. PTA-2813 (corresponding to the 21676 nucleotide sequence) on Dec. 14, 2000. ATCC Patent Deposits PTA-2012, PTA-2170, PTA-2171, PTA-2812, and PTA-2813 are referred to collectively herein as “the deposited cDNAs.”

The invention also provides isolated ADH nucleic acid molecules having the sequences shown in SEQ ID NOS:6, 8, 10, 12, and 14, or in the deposited cDNAs.

The invention also provides variant polypeptides having an amino acid sequence that is substantially homologous to the amino acid sequences shown in SEQ ID NOS:5, 7, 9, 11, and 13, or encoded by the deposited cDNAs.

The invention also provides variant nucleic acid sequences that are substantially homologous to the nucleotide sequences shown in SEQ ID NOS:6, 8, 10, 12, and 14, or in the deposited cDNAs.

The invention also provides fragments of the polypeptides shown in SEQ ID NOS:5, 7, 9, 11, and 13, and nucleotide sequences shown in SEQ ID NOS:6, 8, 10, 12, and 14, as well as substantially homologous fragments of the polypeptides or nucleic acids.

The invention further provides nucleic acid constructs comprising the nucleic acid molecules described herein. In a preferred embodiment, the nucleic acid molecules of the invention are operatively linked to a regulatory sequence.

The invention also provides vectors and host cells for expressing the ADH nucleic acid molecules and polypeptides, and particularly recombinant vectors and host cells.

The invention also provides methods of making the vectors and host cells and methods for using them to produce the ADH nucleic acid molecules and polypeptides.

The invention also provides antibodies or antigen-binding fragments thereof that selectively bind the ADH polypeptides and fragments.

The invention also provides methods of screening for compounds that modulate expression or activity of the ADH polypeptides or nucleic acid (RNA or DNA).

The invention also provides a process for modulating ADH polypeptide or nucleic acid expression or activity, especially using the screened compounds. Modulation may be used to treat conditions related to aberrant activity or expression of the ADH polypeptides or nucleic acids.

The invention also provides assays for determining the activity of or the presence or absence of the ADH polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

The invention also provides assays for determining the presence of a mutation in the polypeptides or nucleic acid molecules, including for disease diagnosis.

In still a further embodiment, the invention provides a computer readable means containing the nucleotide and/or amino acid sequences of the nucleic acids and polypeptides of the invention, respectively.

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the invention are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.

Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Polypeptides

The invention is based on the discovery of novel human alcohol dehydrogenases. Specifically, an expressed sequence tag (EST) was selected based on homology to the alcohol dehydrogenase sequence. This EST was used to design primers based on sequences that it contains, and the primers were used to identify cDNAs from human cDNA libraries, including primary osteoblasts. Positive clones were sequenced and the overlapping fragments were assembled. Analysis of each of the assembled sequences revealed that the cloned cDNA molecules encode ADHs.

The invention thus relates to novel ADHs having the deduced amino acid sequence shown in FIGS. 7A-B, 13, 17A-B, 21A-B, and 25A-B, or the amino acid sequences shown in SEQ ID NOS:5, 7, 9, 11, and 13, or the amino acid sequences encoded by the deposited cDNAs as Patent Deposit Numbers PTA-2012, PTA-2170, or PTA-2171.

The deposits will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms. The deposits are provided as a convenience to those of skill in the art and are not an admission that a deposit is required under 35 U.S.C. § 112. The deposited sequences, as well as the polypeptides encoded by the sequences, are incorporated herein by reference and control in the event of any conflict, such as a sequencing error, with the description in this application.

“ADH polypeptide” or “ADH protein” refers to the polypeptides in SEQ ID NOS:5, 7, 9, 11, and 13, or the polypeptides encoded by the deposited cDNAs. The term “ADH protein” or “ADH polypeptide”, however, further includes the numerous variants described herein, as well as fragments derived from the full-length ADHs and variants.

Tissues and/or cells in which the 21620 ADH is found include, but are not limited to, those shown in FIGS. 11 and 12. Tissues in which the gene is highly expressed include brain, colon, kidney, and small intestine. Moderate expression occurs in liver, muscle, and testes. Lower positive expression occurs in the aorta, breast, cervix, esophagus, heart, lung, lymph, ovary, placenta, spleen, thymus, thyroid, and vein. The 21620 ADH is also expressed in malignant breast, lung, and colon tissue, and in liver metastases derived from malignant colonic tissue.

The present invention thus provides isolated or purified polypeptides of the 21620 ADH, 33756 ADH, 21676 ADH, 21612 ADH, and 21615 ADH and variants and fragments thereof.

The short-chain alcohol dehydrogenase family signature is found in the 21620 ADH from about amino acid 166 to about amino acid 176 and in the 21615 ADH from about amino acid 147 to about amino acid 157.

Based on a Blast search, highest homology to the 21620 ADH was shown to Antennal-specific Short-chain Dehydrogenase/reductase from Drosophila melanogaster (Genbank Acc. No. AF116553) and to the Oxidoreductase from Haloferax volcanii (Genbank Acc. No. U95375).

Based on a Blast search, highest homology to the 33756 ADH was shown to CGI-82 from Homo sapiens (Genbank Acc. No. AF151840), UBE-1b from Mus musculus (Genbank Acc. No. AB030504), and UBE-1a from Mus musculus (Genbank Acc. No. AB030503).

Based on a Blast search, no significant homology was found to the 21676 ADH.

Based on a Blast search, highest homology to the 21612 ADH was shown to a protein similar to alcohol dehydrogenase from C. elegans (Genbank Acc. No. U28739), a protein similar to alcohol dehydrogenase from C. elegans (Genbank Acc. No. Z74029), and to the hypothetical protein RV3224 from Mycobacterium tuberculosis (Genbank Acc. No. Z95120).

Based on a Blast search, highest homology to the 21615 ADH was shown to a 3-oxoacyl-(acyl carrier protein) reductase from Thermotoga maritima (Genbank Acc. No. AAD36790).

As used herein, a polypeptide is said to be “isolated” or “purified” when it is substantially free of cellular material when it is isolated from recombinant and non-recombinant cells, or free of chemical precursors or other chemicals when it is chemically synthesized. A polypeptide, however, can be joined to another polypeptide with which it is not normally associated in a cell and still be considered “isolated” or “purified.”

The ADH polypeptides can be purified to homogeneity. It is understood, however, that preparations in which the polypeptide is not purified to homogeneity are useful and considered to contain an isolated form of the polypeptide. The critical feature is that the preparation allows for the desired function of the polypeptide, even in the presence of considerable amounts of other components. Thus, the invention encompasses various degrees of purity.

In one embodiment, the language “substantially free of cellular material” includes preparations of the ADH having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins. When the polypeptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20%, less than about 10%, or less than about 5% of the volume of the protein preparation.

An ADH polypeptide is also considered to be isolated when it is part of a membrane preparation or is purified and then reconstituted with membrane vesicles or liposomes.

The language “substantially free of chemical precursors or other chemicals” includes preparations of the ADH polypeptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language “substantially free of chemical precursors or other chemicals” includes preparations of the polypeptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

In one embodiment, the ADH polypeptides comprise the amino acid sequences shown in SEQ ID NOS:5, 7, 9, 11, and 13. However, the invention also encompasses sequence variants. Variants include a substantially homologous protein encoded by the same genetic locus in an organism, i.e., an allelic variant.

The 21620 ADH has been mapped to human chromosome 17 (17q12-21) with flanking markers WI-3010 (9.7 cR) and WI-4251 (17.3 cR). Mutations near this locus include, but are not limited to, the following: Wilms tumor 4; patella aplasia or hypoplasia; psoriasis susceptibility 2 (psors2); malignant hyperthermia susceptibility 2 (MHS2); pallidopontonigral degeneration (PPND); pseudohypoaldosteronism type II locus B; and gliosis, familial progressive subcortical. In the mouse this locus is associated with the following: susceptibility to lung cancer (Sluc4); pulmonary adenoma resistance (Par1); radiation-induced apoptosis 4 (Rapop4); cocked (co); open eyelids (oe); ovum mutant (Om); rimy (rmy); susceptibility to experimental allergic encephalomyelitis 7 (Eae7); liver weight QTL 4 (Lwq4); alopecia (Al); spleen weight QTL 1 (Swq1); modifier of von Willebrand factor (Mvwf); neuron number control (Nnc1); recombination induced mutation 3 (rim3); bald-arthritic (Bda); bare skin (Bsk); rex (re); alymphoplasia (aly); cleft lip 1 (clf1); seizure susceptibility 3 (Szs3); uncovered (Uncv). Genes near this locus include CDC18L, RARA, PDE6G, IGFBP4, TCFL4, NAGLU, FZD2, PYY, ERBB2, RABL, SCYA11, KRT12, NEUROD2, SLC6A4, ACACA, SCYA1, and BRCA1.

The 21612 ADH has been mapped to human chromosome 9 (9q22-33) with flanking markers WI-6207 (5.7 cR) and D9S174 (6.0 cR). Mutations near this locus include, but are not limited to, the following: hypomagnesemia with secondary hypocalcemia (HOMG); hemophagocytic lymphohistiocytosis, familial 1; nephronophthisis (NPHP2), infantile; HSN1, neuropathy, hereditary sensory, type 1; high density lipoprotein deficiency (HDLDT1), tangier type 1; dysautonomia (dys), familial; muscular dystrophy, limb-girdle, type 2H; acrofacial dysostosis 1 (AFD1), nager type; amyotrophic lateral sclerosis 4 (ALS4), juvenile; and multiple self-healing squamous epithelioma (MSSE). In the mouse this locus is associated with the following: vacillans (vc), whirler (wi), ochre (och), Hertwig's anemia (an), b-associated fitness (baf), iris stromal atrophy (isa), lymphoma resistance (lyr), and systemic lupus erythematosus susceptibility 2 (sle2). Genes near this locus include SCYA5, ZFP37, UGCG, SLC31A2, HXB, HPRP4P, ORM1, TNFSF8, TXN, IKBAKAP, PTPN3, EDG2, and CSMF (chondrosarcoma, myxoid extraskeletal, fused to EWS).

Variants also encompass proteins derived from other genetic loci in an organism, but having substantial homology to the ADHs of SEQ ID NOS:5, 7, 9, 11, and 13. Variants also include proteins substantially homologous to the ADHs but derived from another organism, i.e., an ortholog. Variants also include proteins that are substantially homologous to the ADHs that are produced by chemical synthesis. Variants also include proteins that are substantially homologous to the ADHs that are produced by recombinant methods. It is understood, however, that variants exclude any amino acid sequences disclosed prior to the invention.

As used herein, two proteins (or a region of the proteins) are substantially homologous when the amino acid sequences are at least about 70-75%, typically at least about 80-85%, and most typically at least about 90-95% or more homologous. A substantially homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid sequence hybridizing to the nucleic acid sequence, or portion thereof, of the sequence shown in SEQ ID NOS:6, 8, 10, 12, and 14 under stringent conditions as more fully described below.

To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, or 90% of the length of the reference sequence (e.g., when aligning a second sequence to the amino acid sequences herein having 502 amino acid residues, at least 165, preferably at least 200, more preferably at least 250, even more preferably at least 300, and even more preferably at least 350, 400, 450, and 500 amino acid residues are aligned). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
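The position-by-position comparison described above can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned and padded with a gap character, and it counts every alignment column (including gap columns) in the denominator so that each gap lowers the score. That denominator is only one of several conventions in use and is an assumption of this sketch, as are the function name and the toy sequences.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over all columns of a pairwise alignment.

    Gap columns count toward the denominator, so longer or more numerous
    gaps reduce the result (an assumed convention, not the only one).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column is identical when both sequences carry the same residue
    # and that residue is not the gap character.
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


# Hypothetical toy alignment, not taken from SEQ ID NOS:5-14.
print(round(percent_identity("MKT-AYIAK", "MKTQAYLAK"), 2))
```

Other conventions divide by the length of the shorter sequence or by the number of ungapped columns; the choice should be stated whenever identity values are compared.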

The invention also encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by the ADH. Similarity is determined by conservative amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Conservative substitutions are likely to be phenotypically silent. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1 Conservative Amino Acid Substitutions.
Aromatic: Phenylalanine, Tryptophan, Tyrosine
Hydrophobic: Leucine, Isoleucine, Valine
Polar: Glutamine, Asparagine
Basic: Arginine, Lysine, Histidine
Acidic: Aspartic Acid, Glutamic Acid
Small: Alanine, Serine, Threonine, Methionine, Glycine
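The groupings of Table 1 lend themselves to a simple lookup. The following Python sketch encodes the six categories with the standard one-letter amino acid codes and tests whether a substitution stays within a category. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and treating an unchanged residue as "not a substitution" is a choice made for this illustration only.

```python
# Table 1 categories, written with standard one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "aromatic": {"F", "W", "Y"},          # Phe, Trp, Tyr
    "hydrophobic": {"L", "I", "V"},       # Leu, Ile, Val
    "polar": {"Q", "N"},                  # Gln, Asn
    "basic": {"R", "K", "H"},             # Arg, Lys, His
    "acidic": {"D", "E"},                 # Asp, Glu
    "small": {"A", "S", "T", "M", "G"},   # Ala, Ser, Thr, Met, Gly
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True when both residues fall in the same Table 1 category."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group
               for group in CONSERVATIVE_GROUPS.values())


print(is_conservative("D", "E"))  # acidic for acidic
print(is_conservative("K", "E"))  # basic for acidic
```

Residues that Table 1 does not list (for example cysteine and proline) belong to no group in this sketch, so any substitution involving them is reported as non-conservative.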

The comparison of sequences and determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm. (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991).

A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al. (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See www.ncbi.nlm.nih.gov. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).

In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman et al. (1970) (J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two sequences is determined using the GAP program in the GCG software package (Devereux et al. (1984) Nucleic Acids Res. 12(1):387) (available at www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
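For readers unfamiliar with global alignment, the dynamic-programming idea behind the Needleman et al. algorithm can be sketched in a few lines of Python. This is a simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix, and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b). Scoring values are illustrative
    assumptions, not the matrices or weights named in the text.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


# Hypothetical toy sequences.
print(needleman_wunsch("ACGT", "AGT"))
```

The aligned strings returned here can be compared column by column to obtain a percent identity; a production tool would substitute a residue scoring matrix and affine gap weights such as those listed above.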

Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torellis et al. (1994) Comput. Appl. Biosci. 10:3-5; and FASTA described in Pearson et al. (1988) PNAS 85:2444-8.

A variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these.

Variant polypeptides can be fully functional or can lack function in one or more activities. For example, variants of the ADHs can have an altered developmental expression, temporal expression or tissue-preferred expression. ADH variants can also have an altered interaction with cellular components, substrates, coenzymes, metal ions, or ADH subunits. An altered interaction comprises either a higher or lower affinity of the ADH for the various cellular components, substrates, coenzymes, metal ions, or ADH subunits. By “coenzyme” is intended a molecule that is associated with the ADH and is essential for ADH activity. Some coenzymes are covalently linked to their enzyme while others are less tightly bound. A covalently linked coenzyme is referred to as a prosthetic group of the enzyme. By “coenzyme” is also intended the oxidized or reduced product of the coenzyme which is formed following the enzymatic reaction mediated by the ADH polypeptide. For example, in the biological oxidation of an alcohol to an aldehyde, a hydrogen ion is transferred to the coenzyme NAD⁺ to form the coenzyme product NADH. Coenzymes of ADH include, but are not limited to, NAD⁺ and NAD⁺ analogues (Plapp et al. (1986) Biochemistry 25:5396-5402 and Yamazaki et al. (1984) J. Biochem 95:109-115), β-NAD⁺ (Favilla et al. (1980) Eur. J. Biochem 104:223-227 and Creagh et al. (1993) Biotechnol. Bioeng. 41:156-161), benzoylpyridine adenine dinucleotide (Samama et al. (1986) Eur. J. Biochem. 159:375-380), NADH, NADP⁺, and NADPH. Variants of ADH may also have altered interactions with metal ions including, but not limited to, Zn²⁺, Co²⁺, Mg²⁺, and Fe²⁺. See, for example, Yabe et al. (1992) Biosci. Biotechnol. Biochem. 56:338-339 and Leblov et al. (1972) Phytochemistry 11:1345-1346. Variants of ADH can also have an altered interaction with a substrate. Substrates of ADH include, but are not limited to, primary or secondary alcohols or hemiacetals, and cyclic secondary alcohols. By “substrate” is also intended the products resulting from the oxidation of the above mentioned substrates. Such products include, for example, various aldehydes and ketones. Other substrates include retinol, steroids, and carcinogens such as nitrobenzaldehyde and 1,2-dimethylhydrazine. Variants of ADH can also have an altered subunit interaction that affects the ability of ADH to form an active multimeric structure.

Useful variants of ADH polypeptides further include alterations in catalytic activity. The enzymatic reaction mediated by ADH is reversible and comprises either the oxidation, i.e., removal of electrons, of the above mentioned substrates or their reduction, i.e., addition of electrons. The catalytic reaction further comprises the oxidation or reduction of the coenzyme. Therefore, one embodiment involves a variant that results in binding of the substrate but results in slower oxidation/reduction or no oxidation/reduction of the substrate. Another variation can result in an increased rate of substrate oxidation/reduction. Other useful variations can include an altered binding affinity for a coenzyme or substrate. For example, an increased or decreased binding affinity of a coenzyme can alter the binding affinity of the ADH to the substrate and also alter the rate of substrate oxidation/reduction. Another variation can prevent the ADH monomer from associating with other ADH subunits to form an active multimeric complex.

Another useful variation provides a fusion protein in which one or more domains or subregions are operationally fused to one or more domains or subregions from another ADH. Specifically, a domain or subregion can be introduced that alters the coenzyme or substrate specificities or the rate of the enzymatic reaction.

Fully functional variants typically contain only conservative variations or variations in non-critical residues or in non-critical regions. Functional variants can also contain substitution of similar amino acids, which results in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree.

Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncations, or a substitution, insertion, inversion, or deletion in a critical residue or critical region.

As indicated, variants can be naturally-occurring or can be made by recombinant means or chemical synthesis to provide useful and novel characteristics for the ADH polypeptide. This includes preventing immunogenicity from pharmaceutical formulations by preventing protein aggregation.

Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al. (1989) Science 244:1081-1085). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity, for example by measuring the binding affinity for the coenzyme or substrate or by determining the catalytic constants for substrate oxidation/reduction. Sites that are critical for coenzyme and substrate binding can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al. (1992) J. Mol. Biol. 224:899-904; de Vos et al. (1992) Science 255:306-312).

The assays for ADH enzyme activity are well known in the art and can be found, for example, in Oppermann et al. (1999) FEBS 451:238-242, Thomasson et al. (1993) Behavior Genetics 23:131-136, and Zubey (1988) Macmillan Publishing Company, New York. These assays include, but are not limited to, determination of the Michaelis constant (K_(m)) or the dissociation constant for the ADH/substrate complex. Such analysis of enzyme activity may be performed spectrophotometrically by recording the change in absorbance that accompanies the interconversion of NAD⁺ and NADH. The turnover number, k_(cat), can also be measured. Here, k_(cat) is defined as the maximum number of molecules of substrate converted to product per active site per unit of time. The specificity constant (k_(cat)/K_(m)) can also be used to measure the ability of the ADH to discriminate between competing substrates. Similar assays can also be performed to measure ADH/coenzyme interactions. In vivo measurements of ADH activity can be determined by pharmacokinetic studies. In such studies, an ethanol dose is administered and the blood ethanol concentration is monitored over time. The area under the concentration-time curve indicates the rate of ethanol elimination from the system. A larger area under the blood alcohol concentration-time curve indicates slower ethanol metabolism.
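The kinetic quantities named above can be illustrated with a minimal sketch. It assumes simple Michaelis-Menten behavior and a trapezoidal estimate of the area under the blood ethanol concentration-time curve; the function names and all numeric values are hypothetical and are not taken from the cited assays.

```python
from typing import Sequence


def michaelis_menten_rate(s: float, vmax: float, km: float) -> float:
    """Initial rate v = Vmax * [S] / (Km + [S]) (Michaelis-Menten assumption)."""
    return vmax * s / (km + s)


def turnover_number(vmax: float, active_site_conc: float) -> float:
    """k_cat = Vmax / [active sites]: substrate converted per site per unit time."""
    return vmax / active_site_conc


def specificity_constant(kcat: float, km: float) -> float:
    """k_cat / K_m, used to compare competing substrates."""
    return kcat / km


def auc_trapezoid(times: Sequence[float], concs: Sequence[float]) -> float:
    """Area under a concentration-time curve by the trapezoidal rule."""
    if len(times) != len(concs):
        raise ValueError("times and concs must have the same length")
    total = 0.0
    for i in range(1, len(times)):
        total += 0.5 * (concs[i] + concs[i - 1]) * (times[i] - times[i - 1])
    return total


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    kcat = turnover_number(vmax=10.0, active_site_conc=0.5)
    print(michaelis_menten_rate(s=2.0, vmax=10.0, km=2.0))
    print(specificity_constant(kcat, km=2.0))
    print(auc_trapezoid([0.0, 1.0, 2.0], [0.0, 2.0, 0.0]))
```

Under these assumptions, a substrate with a higher k_(cat)/K_(m) is the preferred substrate, and for the same ethanol dose a larger area under the curve corresponds to slower elimination.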

Substantial homology can be to the entire nucleic acid or amino acid sequence or to fragments of these sequences.

The invention thus also includes polypeptide fragments of the ADHs. Fragments can be derived from the amino acid sequences shown in SEQ ID NOS:5, 7, 9, 11, and 13. However, the invention also encompasses fragments of the variants of the ADHs as described herein.

The fragments to which the invention pertains, however, are not to be construed as encompassing fragments that may be disclosed prior to the present invention. Accordingly, a fragment of the 21620 ADH can comprise at least about 9, 15, 20, 25, 30, 35, 40 or more contiguous amino acids. A fragment of the 33756 ADH can comprise at least about 21, 25, 30, 35, 40, 45, 50, or more contiguous amino acids. A fragment of the 21676 ADH can comprise at least about 7, 10, 15, 20, 25, 30, 35 or more contiguous amino acids. A fragment of the 21612 ADH can comprise at least about 14, 20, 25, 30, 35, 40 or more contiguous amino acids. A fragment of the 21615 ADH can comprise at least about 7, 10, 15, 20, 25, 30, 35 or more contiguous amino acids. Fragments can retain one or more of the biological activities of the protein, for example the ability to bind a coenzyme or substrate or the ability to catalyze the oxidation/reduction of a substrate. Alternatively, fragments can be used as an immunogen to generate ADH antibodies.

Biologically active fragments (peptides which are, for example, 5, 7, 10, 12, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acids in length) can comprise a domain or motif, e.g., catalytic site, substrate binding site, coenzyme binding site, short-chain alcohol dehydrogenase signature, microbodies C-terminal targeting signals, and sites for glycosylation, protein kinase C phosphorylation, casein kinase II phosphorylation, tyrosine kinase phosphorylation, and N-myristoylation. Further possible fragments include sites important for cellular and subcellular targeting.

Such domains or motifs can be identified by means of routine computerized homology searching procedures.

Fragments, for example, can extend in one or both directions from the functional site to encompass 5, 10, 15, 20, 30, 40, 50, or up to 100 amino acids. Further, fragments can include sub-fragments of the specific domains mentioned above, which sub-fragments retain the function of the domain from which they are derived.
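As an illustration of how contiguous fragments and site-centered fragments of this kind might be enumerated computationally, the following sketch uses a hypothetical ten-residue sequence and hypothetical site coordinates. It does not use the sequences of SEQ ID NOS:5, 7, 9, 11, or 13, and the function names are assumptions made for this example.

```python
from typing import List


def contiguous_fragments(seq: str, length: int) -> List[str]:
    """Return every contiguous fragment of exactly `length` residues."""
    if length <= 0:
        raise ValueError("length must be positive")
    # The range is empty when the sequence is shorter than the fragment.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def extend_from_site(seq: str, site_start: int, site_end: int, flank: int) -> str:
    """Return the site seq[site_start:site_end] extended by up to `flank`
    residues in both directions, clipped to the ends of the sequence."""
    start = max(0, site_start - flank)
    end = min(len(seq), site_end + flank)
    return seq[start:end]


if __name__ == "__main__":
    # Hypothetical sequence in one-letter code, for illustration only.
    demo = "ACDEFGHIKL"
    print(contiguous_fragments(demo, 9))
    print(extend_from_site(demo, site_start=4, site_end=6, flank=2))
```

The half-open coordinates (start inclusive, end exclusive) follow Python slicing; a one-based residue numbering, as is usual for sequence listings, would need to be converted first.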

These regions can be identified by well-known methods involving computerized homology analysis.

The invention also provides fragments with immunogenic properties. These contain an epitope-bearing portion of the ADH or ADH variants. These epitope-bearing peptides are useful to raise antibodies that bind specifically to an ADH polypeptide or region or fragment. These peptides can contain at least 10, at least 12, at least 14, or between at least about 15 to about 30 amino acids.

Non-limiting examples of antigenic polypeptides that can be used to generate antibodies include peptides derived from an extracellular site. Regions having a high antigenicity index are shown in FIGS. 8, 14, 18, 22, and 26, for the 21620, 33756, 21676, 21612, and 21615 ADHs, respectively. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular peptide regions.

The epitope-bearing ADH polypeptides may be produced by any conventional means (Houghten, R. A. (1985) Proc. Natl. Acad. Sci. USA 82:5131-5135). Simultaneous multiple peptide synthesis is described in U.S. Pat. No. 4,631,211.

Fragments can be discrete (not fused to other amino acids or polypeptides) or can be within a larger polypeptide. Further, several fragments can be comprised within a single larger polypeptide. In one embodiment a fragment designed for expression in a host can have heterologous pre- and pro-polypeptide regions fused to the amino terminus of the ADH fragment and an additional region fused to the carboxyl terminus of the fragment.

The invention thus provides chimeric or fusion proteins. These comprise an ADH peptide sequence operatively linked to a heterologous peptide having an amino acid sequence not substantially homologous to the ADH. “Operatively linked” indicates that the ADH peptide and the heterologous peptide are fused in-frame. The heterologous peptide can be fused to the N-terminus or C-terminus of the ADH or can be internally located.

In one embodiment the fusion protein does not affect ADH function per se. For example, the fusion protein can be a GST-fusion protein in which the ADH sequences are fused to the C-terminus of the GST sequences. Other types of fusion proteins include, but are not limited to, enzymatic fusion proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL-4 fusions, poly-His fusions and Ig fusions. Such fusion proteins, particularly poly-His fusions, can facilitate the purification of recombinant ADH. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a heterologous signal sequence. Therefore, in another embodiment, the fusion protein contains a heterologous signal sequence at its N-terminus.

EP-A-0 464 533 discloses fusion proteins comprising various portions of immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EP-A 0232 262). In drug discovery, for example, human proteins have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists (Bennett et al. (1995) J. Mol. Recog. 8:52-58 and Johanson et al. (1995) J. Biol. Chem. 270:9459-9471). Thus, this invention also encompasses soluble fusion proteins containing an ADH polypeptide and various portions of the constant regions of heavy or light chains of immunoglobulins of various subclasses (IgG, IgM, IgA, IgE). Preferred as immunoglobulin is the constant part of the heavy chain of human IgG, particularly IgG1, where fusion takes place at the hinge region. For some uses it is desirable to remove the Fc after the fusion protein has been used for its intended purpose, for example when the fusion protein is to be used as antigen for immunizations. In a particular embodiment, the Fc part can be removed in a simple way by a cleavage sequence, which is also incorporated and can be cleaved with factor Xa.

A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al. (1992) Current Protocols in Molecular Biology). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). An ADH-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the ADH.

Another form of fusion protein is one that directly affects ADH functions. Accordingly, an ADH polypeptide is encompassed by the present invention in which one or more of the ADH domains (or parts thereof) has been replaced by homologous domains (or parts thereof) from another ADH or a short-chain dehydrogenase/reductase (SDR) family member. Accordingly, various permutations are possible. For example, the substrate binding domain, or subregion thereof, can be replaced with the substrate binding domain or subregion from another ADH or SDR family member. As a further example, the catalytic domain, or coenzyme binding domains or parts thereof, can be replaced with the appropriate domain from another ADH or SDR family member. Thus, chimeric ADHs can be formed in which one or more of the native domains or subregions has been replaced by another.

Additionally, chimeric ADH proteins can be produced in which one or more functional sites is derived from a different ADH or a short-chain dehydrogenase/reductase family member. It is understood, however, that sites could be derived from ADHs or short-chain dehydrogenase/reductase family members that occur in the mammalian genome but which have not yet been discovered or characterized. Such sites include but are not limited to the catalytic site, substrate binding site, coenzyme binding site, sites important for targeting to subcellular and cellular locations, sites functional for interaction with ADH subunits, protein kinase A phosphorylation sites, glycosylation sites, and other functional sites disclosed herein.

The isolated ADHs can be purified from cells that naturally express them. Tissues and cells that express high levels of the 21620 ADH include, but are not limited to, brain, colon, kidney, and small intestine. Moderate levels of expression occur in the liver, muscle, and testes. Lower positive expression occurs in the aorta, breast, cervix, esophagus, heart, lung, lymph, ovary, placenta, spleen, thymus, thyroid, and vein. The 21620 ADH is also expressed in malignant breast, lung, and colon tissue, and liver metastases derived from colon. The ADHs of the present invention can also be purified from cells that have been altered to express them (recombinant), or synthesized using known protein synthesis methods.

In one embodiment, the protein is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the ADH polypeptide is cloned into an expression vector, the expression vector is introduced into a host cell, and the protein is expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally-occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in polypeptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art.

Accordingly, the polypeptides also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence for purification of the mature polypeptide or a pro-protein sequence.

Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

Such modifications are well-known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as Proteins—Structure and Molecular Properties, 2nd ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (1990) Meth. Enzymol. 182:626-646; and Rattan et al. (1992) Ann. N.Y. Acad. Sci. 663:48-62.

As is also well known, polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of post-translation events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translational natural processes and by synthetic methods.

Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. Blockage of the amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in naturally-occurring and synthetic polypeptides. For instance, the amino-terminal residue of polypeptides made in E. coli, prior to proteolytic processing, almost invariably will be N-formylmethionine.

The modifications can be a function of how the protein is made. For recombinant polypeptides, for example, the modifications will be determined by the host cell posttranslational modification capacity and the modification signals in the polypeptide amino acid sequence. Accordingly, when glycosylation is desired, a polypeptide should be expressed in a glycosylating host, generally a eukaryotic cell. Insect cells often carry out the same posttranslational glycosylations as mammalian cells and, for this reason, insect cell expression systems have been developed to efficiently express mammalian proteins having native patterns of glycosylation. Similar considerations apply to other modifications.

The same type of modification may be present in the same or varying degree at several sites in a given polypeptide. Also, a given polypeptide may contain more than one type of modification.

Polypeptide Uses

The protein sequences of the present invention can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

The ADH polypeptides are useful for producing antibodies specific for the ADH, regions, or fragments. Regions having a high antigenicity index score are shown in FIGS. 8, 14, 18, 22, and 26 for the 21620 ADH, 33756 ADH, 21676 ADH, 21612 ADH, and 21615 ADH, respectively.

The ADH polypeptides are useful for biological assays related to ADHs. Such assays involve any of the known ADH functions or activities or properties useful for diagnosis and treatment of ADH-related conditions.

The ADH polypeptides are also useful in drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the ADH, as a biopsy or expanded in cell culture. In one embodiment, however, cell-based assays involve recombinant host cells expressing the ADH.

Determining the ability of the test compound to interact with the ADH can also comprise determining the ability of the test compound to preferentially bind to the polypeptide as compared to the ability of a known binding molecule (e.g., a coenzyme or substrate) to bind to the polypeptide.

The polypeptides can be used to identify compounds that modulate ADH activity. Such compounds, for example, can increase or decrease the affinity of the substrate or coenzyme for ADH. Such compounds can also increase or decrease the enzymatic activity of the ADH. Additionally, such compounds can alter the interaction of ADH with a metal ion or alter the ability of the ADH polypeptide to form a multimeric structure. Compounds that modulate ADH activity include, but are not limited to, pyrazole, 4-methylpyrazole, p-hydroxymercuribenzoate, o-phenanthroline, iodoacetamide, iodoacetate, imidazole, colloidal bismuth subcitrate, cimetidine, ranitidine, and aspirin.

The ADHs of the present invention and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to the ADH. These compounds can be further screened against a functional ADH to determine the effect of the compound on the ADH activity. Compounds can be identified that activate (agonist) or inactivate (antagonist) the ADH to a desired degree. Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject).

The ADH polypeptides can be used to screen a compound for the ability to stimulate or inhibit interaction between the ADH protein and a target molecule that normally interacts with the ADH protein. The target can be a coenzyme, metal ion, ADH substrate or another ADH subunit of the multimeric ADH enzyme. The assay includes the steps of combining the ADH protein with a candidate compound under conditions that allow the ADH protein or fragment to interact with the target molecule, and detecting the formation of a complex between the ADH protein and the target or detecting the biochemical consequence of the interaction between the ADH and the target, such as the oxidation/reduction of the substrate or coenzyme.

Determining the ability of the ADH to bind to a target molecule can also be accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) (Sjolander et al. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). As used herein, “BIA” is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.

The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam (1997) Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra).

Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al. (1991) Nature 354:82-84; Houghten et al. (1991) Nature 354:84-86) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al. (1993) Cell 72:767-778); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble full-length ADH or fragment that competes for substrate binding or cofactor binding, interferes with the ADH catalyzed reaction, or interferes with ADH subunit interactions. Other candidate compounds include mutant ADHs or appropriate fragments containing mutations that affect ADH function and thus compete for cofactor binding or substrate binding, interfere with the ADH catalyzed reaction, or interfere with the ADH subunit interactions. Accordingly, a fragment that competes for substrate or coenzyme binding, for example with a higher affinity, or a fragment that binds substrate but does not catalyze its oxidation/reduction is encompassed by the invention.

The invention provides other end points to identify compounds that modulate (stimulate or inhibit) ADH activity. The assays typically involve an assay of events that result from substrate or coenzyme oxidation/reduction that indicate ADH activity. Thus, the expression of genes that are up- or down-regulated in response to the ADH enzyme can be assayed. In one embodiment, the regulatory region of such genes can be operably linked to a marker that is easily detectable, such as luciferase.

Any of the biological or biochemical functions mediated by the ADH can be used as an endpoint assay. These include all of the biochemical or biological events described herein, in the references cited herein and incorporated by reference, and other ADH functions known to those of ordinary skill in the art.

In the case of ADH, specific end points can include an altered NADH/NAD⁺ ratio. For instance, ethanol oxidation results in an increased NADH/NAD⁺ redox potential within the cytosol and mitochondria with subsequent alteration in several tissue metabolites. For example, the increase in cytosolic NADH/NAD⁺ ratio causes an increase in the lactate/pyruvate ratio mediated via lactate dehydrogenase. Other consequences of ethanol- and acetaldehyde-induced redox changes include enhanced triglyceride synthesis, inhibition of Krebs cycle activity, lactic acidosis, ketoacidosis, hyperuricaemia and enhanced fibrogenesis. See, for example, Peters et al. (1998) Novartis Foundation Symposium 216:19-34, herein incorporated by reference.

Furthermore, the metabolism of ethanol via ADH results in the production of acetaldehyde, which is removed by the action of acetaldehyde dehydrogenases. Acetaldehyde alters various cellular functions, including glutathione depletion and inhibition of nuclear repair enzymes. Acetaldehyde can also alter cellular membranes, resulting in severe cellular injury (Lieber et al. (1994) Gastroenterology 106:1085-105). Acetaldehyde toxicity depends on its net formation and can be increased when ADH activity is high and acetaldehyde dehydrogenase activity is low. Additional end points that can be assayed include biological events that are a consequence of ADH oxidation of retinol to retinal, which include but are not limited to differentiation of epithelium and spermatogenesis.

Binding and/or activating compounds can also be screened by using chimeric ADH proteins in which one or more domains, sites, and the like, as disclosed herein, or parts thereof, can be replaced by their heterologous counterparts derived from other ADHs or from any other short-chain dehydrogenase/reductase family member. For example, a substrate binding region or coenzyme binding region can be used that interacts with a different substrate or coenzyme specificity and/or affinity than the native ADH. Accordingly, a different set of oxidized/reduced substrates or coenzymes is available as an end-point assay for activation. Alternatively, a heterologous targeting sequence can replace the native targeting sequence. This will result in different subcellular or cellular localization. As a further alternative, sites that are responsible for developmental, temporal, or tissue specificity can be replaced by heterologous sites such that the ADH can be detected under conditions of specific developmental, temporal, or tissue-specific expression.

The ADH polypeptides are also useful in competition binding assays in methods designed to discover compounds that interact with the ADH. Thus, a compound is exposed to an ADH polypeptide under conditions that allow the compound to bind or to otherwise interact with the polypeptide. Soluble ADH polypeptide is also added to the mixture. If the test compound interacts with the soluble ADH polypeptide, it decreases the amount of complex formed or activity from the ADH target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the ADH. Thus, the soluble polypeptide that competes with the target ADH region is designed to contain peptide sequences corresponding to the region of interest.

Another type of competition-binding assay can be used to discover compounds that interact with specific functional sites. As an example, a substrate, such as ethanol, and a candidate compound can be added to a sample of the ADH. Compounds that interact with the ADH at the same site as the ethanol will reduce the amount of complex formed between the ADH and ethanol. Accordingly, it is possible to discover a compound that specifically prevents interaction between the ADH and ethanol. Another example involves adding a candidate compound to a sample of ADH and a coenzyme, such as NAD⁺. A compound that competes with NAD⁺ will reduce the coenzyme interaction with ADH and thereby prevent the subsequent interaction with a substrate or the oxidation of the substrate. Accordingly, compounds can be discovered that directly interact with the ADH and compete with various coenzymes and substrates. Such assays can involve any other component that interacts with the ADH.
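The expected effect of a same-site competitor on complex formation can be illustrated with a minimal sketch. It assumes a single binding site at equilibrium, with ligand and competitor in excess so that free concentrations approximate total concentrations; the function name, the dissociation constants, and the concentrations are hypothetical and are not measured values for any ADH.

```python
def ligand_occupancy(ligand: float, kd_ligand: float,
                     competitor: float = 0.0, kd_competitor: float = 1.0) -> float:
    """Fraction of binding sites occupied by the ligand when a competitor
    binds the same site (simple competitive equilibrium):

        theta = [L] / (Kd_L * (1 + [C] / Kd_C) + [L])

    All concentrations and dissociation constants share the same units.
    """
    if kd_ligand <= 0 or kd_competitor <= 0:
        raise ValueError("dissociation constants must be positive")
    apparent_kd = kd_ligand * (1.0 + competitor / kd_competitor)
    return ligand / (apparent_kd + ligand)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    without = ligand_occupancy(ligand=1.0, kd_ligand=1.0)
    with_competitor = ligand_occupancy(ligand=1.0, kd_ligand=1.0,
                                       competitor=1.0, kd_competitor=1.0)
    print(without, with_competitor)
```

In this model a candidate compound that binds at the substrate or coenzyme site lowers the occupancy at a fixed ligand concentration, which is the reduction in complex that the assay is designed to detect. A compound acting at a different site would not follow this relationship.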

To perform cell-free drug screening assays, it is desirable to immobilize either the ADH, or fragment, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase/ADH fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the radiolabel is determined directly on the immobilized matrix, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of ADH-binding protein found in the bead fraction quantitated from the gel using standard electrophoretic techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the protein but which do not interfere with binding of the protein to its target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by antibody conjugation. Preparations of an ADH-binding target component, such as a coenzyme or a substrate, and a candidate compound are incubated in the ADH-presenting wells and the amount of complex trapped in the well can be quantitated. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the ADH target molecule, or which are reactive with ADH and compete with the target molecule; as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the target molecule.
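
Quantitating the complex trapped in each well lends itself to a simple calculation. As a minimal sketch, assuming each well yields one signal value (for example, radiolabel counts) and that the plate includes a no-compound control and a background control, the reduction in complex formation can be expressed as percent inhibition. The function names, the well labels, and the 50% threshold are hypothetical choices for illustration and are not taken from the assays described herein.

```python
def percent_inhibition(signal, uninhibited, background):
    """Percent reduction in complex signal relative to the control window."""
    window = uninhibited - background
    if window <= 0:
        raise ValueError("uninhibited control must exceed background")
    return 100.0 * (uninhibited - signal) / window


def call_hits(well_signals, uninhibited, background, threshold=50.0):
    """Return {well: percent inhibition} for wells at or above the threshold."""
    hits = {}
    for well, signal in well_signals.items():
        inhibition = percent_inhibition(signal, uninhibited, background)
        if inhibition >= threshold:
            hits[well] = inhibition
    return hits


if __name__ == "__main__":
    # Hypothetical counts for two candidate compounds.
    plate = {"A1": 900.0, "A2": 300.0}
    print(call_hits(plate, uninhibited=1000.0, background=200.0))
```

A threshold-based call of this kind is only a first-pass filter; candidate compounds flagged in this way would still need confirmation by the binding and activity assays described in the text.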

Modulators of ADH activity identified according to these drug screening assays can be used to treat a subject with a disorder mediated by ADH, by treating cells that express the ADH. These methods of treatment include the steps of administering the modulators of ADH activity in a pharmaceutical composition as described herein, to a subject in need of such treatment.

The ADHs of the present invention are expressed in various cell types. Tissues and/or cells in which the 21620 ADH is found include, but are not limited to, those shown in FIGS. 11 and 12. Tissues in which the gene is highly expressed include brain, colon, kidney, and small intestine. Moderate expression occurs in liver, muscle, and testes. Lower positive expression occurs in the aorta, breast, cervix, esophagus, heart, lung, lymph, ovary, placenta, spleen, thymus, thyroid, and vein. The 21620 ADH is also expressed in malignant breast, lung, and colon tissue, and in colon metastases to liver.

Hence the ADHs of the present invention are relevant to treating disorders involving these tissues. Of particular interest are malignant breast, liver, and colon tissue, and liver metastases derived from malignant colon tissue.

Disorders involving the spleen include, but are not limited to, splenomegaly, including nonspecific acute splenitis, congestive splenomegaly, and splenic infarcts; neoplasms, congenital anomalies, and rupture. Disorders associated with splenomegaly include infections, such as nonspecific splenitis, infectious mononucleosis, tuberculosis, typhoid fever, brucellosis, cytomegalovirus, syphilis, malaria, histoplasmosis, toxoplasmosis, kala-azar, trypanosomiasis, schistosomiasis, leishmaniasis, and echinococcosis; congestive states related to portal hypertension, such as cirrhosis of the liver, portal or splenic vein thrombosis, and cardiac failure; lymphohematogenous disorders, such as Hodgkin disease, non-Hodgkin lymphomas/leukemia, multiple myeloma, myeloproliferative disorders, hemolytic anemias, and thrombocytopenic purpura; immunologic-inflammatory conditions, such as rheumatoid arthritis and systemic lupus erythematosus; storage diseases such as Gaucher disease, Niemann-Pick disease, and mucopolysaccharidoses; and other conditions, such as amyloidosis, primary neoplasms and cysts, and secondary neoplasms.

Disorders involving the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, and congenital aganglionic megacolon (Hirschsprung disease); enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

Disorders involving the liver include, but are not limited to, hepatic injury; jaundice and cholestasis, such as bilirubin and bile formation; hepatic failure and cirrhosis, such as cirrhosis, portal hypertension, including ascites, portosystemic shunts, and splenomegaly; infectious disorders, such as viral hepatitis, including hepatitis A-E infection and infection by other hepatitis viruses, clinicopathologic syndromes, such as the carrier state, asymptomatic infection, acute viral hepatitis, chronic viral hepatitis, and fulminant hepatitis; autoimmune hepatitis; drug- and toxin-induced liver disease, such as alcoholic liver disease; inborn errors of metabolism and pediatric liver disease, such as hemochromatosis, Wilson disease, α₁-antitrypsin deficiency, and neonatal hepatitis; intrahepatic biliary tract disease, such as secondary biliary cirrhosis, primary biliary cirrhosis, primary sclerosing cholangitis, and anomalies of the biliary tree; circulatory disorders, such as impaired blood flow into the liver, including hepatic artery compromise and portal vein obstruction and thrombosis, impaired blood flow through the liver, including passive congestion and centrilobular necrosis and peliosis hepatis, hepatic vein outflow obstruction, including hepatic vein thrombosis (Budd-Chiari syndrome) and veno-occlusive disease; hepatic disease associated with pregnancy, such as preeclampsia and eclampsia, acute fatty liver of pregnancy, and intrahepatic cholestasis of pregnancy; hepatic complications of organ or bone marrow transplantation, such as drug toxicity after bone marrow transplantation, graft-versus-host disease and liver rejection, and nonimmunologic damage to liver allografts; tumors and tumorous conditions, such as nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

Disorders involving the uterus and endometrium include, but are not limited to, endometrial histology in the menstrual cycle; functional endometrial disorders, such as anovulatory cycle, inadequate luteal phase, oral contraceptives and induced endometrial changes, and menopausal and postmenopausal changes; inflammations, such as chronic endometritis; adenomyosis; endometriosis; endometrial polyps; endometrial hyperplasia; malignant tumors, such as carcinoma of the endometrium; mixed Müllerian and mesenchymal tumors, such as malignant mixed Müllerian tumors; tumors of the myometrium, including leiomyomas, leiomyosarcomas, and endometrial stromal tumors.

Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia) and focal cerebral ischemia (infarction from obstruction of local blood supply), intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

Disorders involving T-cells include, but are not limited to, cell-mediated hypersensitivity, such as delayed type hypersensitivity and T-cell-mediated cytotoxicity, and transplant rejection; autoimmune diseases, such as systemic lupus erythematosus, Sjögren syndrome, systemic sclerosis, inflammatory myopathies, mixed connective tissue disease, and polyarteritis nodosa and other vasculitides; immunologic deficiency syndromes, including but not limited to, primary immunodeficiencies, such as thymic hypoplasia, severe combined immunodeficiency diseases, and AIDS; leukopenia; reactive (inflammatory) proliferations of white cells, including but not limited to, leukocytosis, acute nonspecific lymphadenitis, and chronic nonspecific lymphadenitis; neoplastic proliferations of white cells, including but not limited to lymphoid neoplasms, such as precursor T-cell neoplasms, such as acute lymphoblastic leukemia/lymphoma, peripheral T-cell and natural killer cell neoplasms that include peripheral T-cell lymphoma, unspecified, adult T-cell leukemia/lymphoma, mycosis fungoides and Sezary syndrome, and Hodgkin disease.

Diseases of the skin include, but are not limited to, disorders of pigmentation and melanocytes, including but not limited to, vitiligo, freckle, melasma, lentigo, nevocellular nevus, dysplastic nevi, and malignant melanoma; benign epithelial tumors, including but not limited to, seborrheic keratoses, acanthosis nigricans, fibroepithelial polyp, epithelial cyst, keratoacanthoma, and adnexal (appendage) tumors; premalignant and malignant epidermal tumors, including but not limited to, actinic keratosis, squamous cell carcinoma, basal cell carcinoma, and Merkel cell carcinoma; tumors of the dermis, including but not limited to, benign fibrous histiocytoma, dermatofibrosarcoma protuberans, xanthomas, and dermal vascular tumors; tumors of cellular immigrants to the skin, including but not limited to, histiocytosis X, mycosis fungoides (cutaneous T-cell lymphoma), and mastocytosis; disorders of epidermal maturation, including but not limited to, ichthyosis; acute inflammatory dermatoses, including but not limited to, urticaria, acute eczematous dermatitis, and erythema multiforme; chronic inflammatory dermatoses, including but not limited to, psoriasis, lichen planus, and lupus erythematosus; blistering (bullous) diseases, including but not limited to, pemphigus, bullous pemphigoid, dermatitis herpetiformis, and noninflammatory blistering diseases: epidermolysis bullosa and porphyria; disorders of epidermal appendages, including but not limited to, acne vulgaris; panniculitis, including but not limited to, erythema nodosum and erythema induratum; and infection and infestation, such as verrucae, molluscum contagiosum, impetigo, superficial fungal infections, and arthropod bites, stings, and infestations.

In normal bone marrow, the myelocytic series (polymorphonuclear cells) make up approximately 60% of the cellular elements, and the erythrocytic series, 20-30%. Lymphocytes, monocytes, reticular cells, plasma cells and megakaryocytes together constitute 10-20%. Lymphocytes make up 5-15% of normal adult marrow. In the bone marrow, cell types are admixed so that precursors of red blood cells (erythroblasts), macrophages (monoblasts), platelets (megakaryocytes), polymorphonuclear leucocytes (myeloblasts), and lymphocytes (lymphoblasts) can be visible in one microscopic field. In addition, stem cells exist for the different cell lineages, as well as a precursor stem cell for the committed progenitor cells of the different lineages. The various types of cells and stages of each would be known to the person of ordinary skill in the art and are found, for example, on page 42 (FIG. 2-8) of Immunology, Immunopathology and Immunity, Fifth Edition, Sell et al. Simon and Schuster (1996), incorporated by reference for its teaching of cell types found in the bone marrow. Accordingly, the invention is directed to disorders arising from these cells. These disorders include but are not limited to the following: diseases involving hematopoietic stem cells; committed lymphoid progenitor cells; lymphoid cells including B and T-cells; committed myeloid progenitors, including monocytes, granulocytes, and megakaryocytes; and committed erythroid progenitors. These include but are not limited to the leukemias, including B-lymphoid leukemias, T-lymphoid leukemias, undifferentiated leukemias; erythroleukemia, megakaryoblastic leukemia, monocytic; [leukemias are encompassed with and without differentiation]; chronic and acute lymphoblastic leukemia, chronic and acute lymphocytic leukemia, chronic and acute myelogenous leukemia, lymphoma, myelodysplastic syndrome, chronic and acute myeloid leukemia, myelomonocytic leukemia; chronic and acute myeloblastic leukemia, chronic and acute myelogenous leukemia, chronic and acute promyelocytic leukemia, chronic and acute myelocytic leukemia, hematologic malignancies of monocyte-macrophage lineage, such as juvenile chronic myelogenous leukemia; secondary AML, antecedent hematological disorder; refractory anemia; aplastic anemia; reactive cutaneous angioendotheliomatosis; fibrosing disorders involving altered expression in dendritic cells, disorders including systemic sclerosis, E-M syndrome, epidemic toxic oil syndrome, eosinophilic fasciitis, localized forms of scleroderma, keloid, and fibrosing colonopathy; angiomatoid malignant fibrous histiocytoma; carcinoma, including primary head and neck squamous cell carcinoma; sarcoma, including Kaposi's sarcoma; fibroadenoma and phyllodes tumors, including mammary fibroadenoma; stromal tumors; phyllodes tumors, including histiocytoma; erythroblastosis; neurofibromatosis; diseases of the vascular endothelium; demyelination, particularly in old lesions; gliosis, vasogenic edema, vascular disease, Alzheimer's and Parkinson's disease; T-cell lymphomas; B-cell lymphomas.

Disorders involving the heart include, but are not limited to, heart failure, including but not limited to, cardiac hypertrophy, left-sided heart failure, and right-sided heart failure; ischemic heart disease, including but not limited to angina pectoris, myocardial infarction, chronic ischemic heart disease, and sudden cardiac death; hypertensive heart disease, including but not limited to, systemic (left-sided) hypertensive heart disease and pulmonary (right-sided) hypertensive heart disease; valvular heart disease, including but not limited to, valvular degeneration caused by calcification, such as calcific aortic stenosis, calcification of a congenitally bicuspid aortic valve, and mitral annular calcification, and myxomatous degeneration of the mitral valve (mitral valve prolapse), rheumatic fever and rheumatic heart disease, infective endocarditis, and noninfected vegetations, such as nonbacterial thrombotic endocarditis and endocarditis of systemic lupus erythematosus (Libman-Sacks disease), carcinoid heart disease, and complications of artificial valves; myocardial disease, including but not limited to dilated cardiomyopathy, hypertrophic cardiomyopathy, restrictive cardiomyopathy, and myocarditis; pericardial disease, including but not limited to, pericardial effusion and hemopericardium and pericarditis, including acute pericarditis and healed pericarditis, and rheumatoid heart disease; neoplastic heart disease, including but not limited to, primary cardiac tumors, such as myxoma, lipoma, papillary fibroelastoma, rhabdomyoma, and sarcoma, and cardiac effects of noncardiac neoplasms; congenital heart disease, including but not limited to, left-to-right shunts (late cyanosis), such as atrial septal defect, ventricular septal defect, patent ductus arteriosus, and atrioventricular septal defect, right-to-left shunts (early cyanosis), such as tetralogy of Fallot, transposition of great arteries, truncus arteriosus, tricuspid atresia, and total anomalous pulmonary venous connection, obstructive congenital anomalies, such as coarctation of aorta, pulmonary stenosis and atresia, and aortic stenosis and atresia, and disorders involving cardiac transplantation.

Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

Disorders involving red cells include, but are not limited to, anemias, such as hemolytic anemias, including hereditary spherocytosis, hemolytic disease due to erythrocyte enzyme defects: glucose-6-phosphate dehydrogenase deficiency, sickle cell disease, thalassemia syndromes, paroxysmal nocturnal hemoglobinuria, immunohemolytic anemia, and hemolytic anemia resulting from trauma to red cells; and anemias of diminished erythropoiesis, including megaloblastic anemias, such as anemias of vitamin B₁₂ deficiency: pernicious anemia, and anemia of folate deficiency, iron deficiency anemia, anemia of chronic disease, aplastic anemia, pure red cell aplasia, and other forms of marrow failure.

Disorders involving the thymus include developmental disorders, such as DiGeorge syndrome with thymic hypoplasia or aplasia; thymic cysts; thymic hyperplasia, which involves the appearance of lymphoid follicles within the thymus, creating thymic follicular hyperplasia; and thymomas, including germ cell tumors, lymphomas, Hodgkin disease, and carcinoids. Thymomas can include benign or encapsulated thymoma, and malignant thymoma Type I (invasive thymoma) or Type II, designated thymic carcinoma.

Disorders involving B-cells include, but are not limited to, precursor B-cell neoplasms, such as lymphoblastic leukemia/lymphoma. Peripheral B-cell neoplasms include, but are not limited to, chronic lymphocytic leukemia/small lymphocytic lymphoma, follicular lymphoma, diffuse large B-cell lymphoma, Burkitt lymphoma, plasma cell neoplasms, multiple myeloma, and related entities, lymphoplasmacytic lymphoma (Waldenström macroglobulinemia), mantle cell lymphoma, marginal zone lymphoma (MALToma), and hairy cell leukemia.

Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney, and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease, such as simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic glomerulonephritis, glomerular lesions associated with systemic disease, including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, and tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney), which includes urothelial carcinomas of renal pelvis.

Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms.

Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

Disorders involving the testis and epididymis include, but are not limited to, congenital anomalies such as cryptorchidism, regressive changes such as atrophy, inflammations such as nonspecific epididymitis and orchitis, granulomatous (autoimmune) orchitis, and specific inflammations including, but not limited to, gonorrhea, mumps, tuberculosis, and syphilis, vascular disturbances including torsion, testicular tumors including germ cell tumors that include, but are not limited to, seminoma, spermatocytic seminoma, embryonal carcinoma, yolk sac tumor, choriocarcinoma, teratoma, and mixed tumors, tumors of sex cord-gonadal stroma including, but not limited to, Leydig (interstitial) cell tumors and Sertoli cell tumors (androblastoma), and testicular lymphoma, and miscellaneous lesions of tunica vaginalis.

Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma.

Disorders involving the thyroid include, but are not limited to, hyperthyroidism; hypothyroidism including, but not limited to, cretinism and myxedema; thyroiditis including, but not limited to, Hashimoto thyroiditis, subacute (granulomatous) thyroiditis, and subacute lymphocytic (painless) thyroiditis; Graves disease; diffuse and multinodular goiter including, but not limited to, diffuse nontoxic (simple) goiter and multinodular goiter; neoplasms of the thyroid including, but not limited to, adenomas, other benign tumors, and carcinomas, which include, but are not limited to, papillary carcinoma, follicular carcinoma, medullary carcinoma, and anaplastic carcinoma; and congenital anomalies.

Disorders involving the skeletal muscle include tumors such as rhabdomyosarcoma.

Disorders involving the pancreas include those of the exocrine pancreas such as congenital anomalies, including but not limited to, ectopic pancreas; pancreatitis, including but not limited to, acute pancreatitis; cysts, including but not limited to, pseudocysts; tumors, including but not limited to, cystic tumors and carcinoma of the pancreas; and disorders of the endocrine pancreas such as diabetes mellitus; islet cell tumors, including but not limited to, insulinomas, gastrinomas, and other rare islet cell tumors.

Disorders involving the small intestine include the malabsorption syndromes such as celiac sprue, tropical sprue (postinfectious sprue), Whipple disease, disaccharidase (lactase) deficiency, abetalipoproteinemia, and tumors of the small intestine including adenomas and adenocarcinoma.

Disorders related to reduced platelet number, thrombocytopenia, include idiopathic thrombocytopenic purpura, including acute idiopathic thrombocytopenic purpura, drug-induced thrombocytopenia, HIV-associated thrombocytopenia, and thrombotic microangiopathies: thrombotic thrombocytopenic purpura and hemolytic-uremic syndrome.

Disorders involving precursor T-cell neoplasms include precursor T lymphoblastic leukemia/lymphoma. Disorders involving peripheral T-cell and natural killer cell neoplasms include T-cell chronic lymphocytic leukemia, large granular lymphocytic leukemia, mycosis fungoides and Sezary syndrome, peripheral T-cell lymphoma, unspecified, angioimmunoblastic T-cell lymphoma, angiocentric lymphoma (NK/T-cell lymphoma), intestinal T-cell lymphoma, adult T-cell leukemia/lymphoma, and anaplastic large cell lymphoma.

Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

Bone-forming cells include the osteoprogenitor cells, osteoblasts, and osteocytes. The disorders of the bone are complex because they may have an impact on the skeleton during any of its stages of development. Hence, the disorders may have variable manifestations and may involve one, multiple or all bones of the body. Such disorders include congenital malformations, achondroplasia and thanatophoric dwarfism, diseases associated with abnormal matrix such as type 1 collagen disease, osteoporosis, Paget disease, rickets, osteomalacia, high-turnover osteodystrophy, low-turnover or aplastic disease, osteonecrosis, pyogenic osteomyelitis, tuberculous osteomyelitis, osteoma, osteoid osteoma, osteoblastoma, osteosarcoma, osteochondroma, chondromas, chondroblastoma, chondromyxoid fibroma, chondrosarcoma, fibrous cortical defects, fibrous dysplasia, fibrosarcoma, malignant fibrous histiocytoma, Ewing sarcoma, primitive neuroectodermal tumor, giant cell tumor, and metastatic tumors.

Disorders in which the ADH expression is relevant include, but are not limited to, drug/alcohol interactions, susceptibility to alcoholism, alcohol-induced organ injury such as alcoholic liver cirrhosis, first-pass metabolism of alcohol, fetal alcohol syndrome, and alcohol-related cancers including, but not limited to, cancers of the esophagus, oral cavity, upper gastrointestinal tract and colorectum. Furthermore, ADH expression is also relevant to alcohol-induced flushing. Alcohol-induced flushing is characterized by the rapid onset of skin vasodilation of the face, neck and chest regions after consumption of small amounts of alcohol. Tachycardia, headache, nausea, hypotension, and extreme drowsiness are also common symptoms of alcohol-induced flushing. Flush reactions have been correlated with a deficiency or absence of the ADH2 enzyme activity. ADH expression is also relevant in the pathogenesis of male sterility and skin diseases, such as psoriasis. Oxidoreductases have also been implicated in the pathophysiology of neurodegenerative disorders and apoptotic processes related to diseases such as Alzheimer's disease.

Treatment is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease.

A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

The ADH polypeptides are thus useful for treating an ADH-associated disorder characterized by aberrant expression or activity of an ADH. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) expression or activity of the protein. In another embodiment, the method involves administering the ADH as therapy to compensate for reduced or aberrant expression or activity of the protein.

Methods for treatment include but are not limited to the use of soluble ADH or fragments of the ADH protein that compete for substrate or coenzyme binding, interfere with subunit interaction, or interfere with the reaction mediated by the ADH polypeptide. These ADHs or fragments can have a higher affinity for the target so as to provide effective competition.

Stimulation of activity is desirable in situations in which the protein is abnormally downregulated and/or in which increased activity is likely to have a beneficial effect. Likewise, inhibition of activity is desirable in situations in which the protein is abnormally upregulated and/or in which decreased activity is likely to have a beneficial effect. In one example of such a situation, a subject has a disorder characterized by aberrant development or cellular differentiation. In another example, the subject has a proliferative disease (e.g., cancer). In another example, the subject has a disorder mediated by an altered NADH/NAD⁺ redox potential, as described herein.

In yet another aspect of the invention, the proteins of the invention can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO 94/10300), to identify other proteins (captured proteins) which bind to or interact with the proteins of the invention and modulate their activity.

The ADH polypeptides also are useful to provide a target for diagnosing a disease or predisposition to disease mediated by the ADH, including, but not limited to, diseases involving tissues in which the ADHs are expressed as disclosed herein, and particularly in breast, lung, colon, and liver metastases derived from malignant colon tissue. Accordingly, methods are provided for detecting the presence, or levels of, the ADH in a cell, tissue, or organism. The method involves contacting a biological sample with a compound capable of interacting with the ADH such that the interaction can be detected.

One agent for detecting ADH is an antibody capable of selectively binding to ADH. A biological sample includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject.

The ADH also provides a target for diagnosing active disease, or predisposition to disease, in a patient having a variant ADH. Thus, ADH can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in an aberrant protein. This includes amino acid substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest, altered ADH activity in cell-based or cell-free assay, alteration in substrate or coenzyme binding, altered interaction with ADH subunits, altered rate of substrate oxidation/reduction, altered antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a protein in general or in an ADH specifically.

In vitro techniques for detection of ADH include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. Alternatively, the protein can be detected in vivo in a subject by introducing into the subject a labeled anti-ADH antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Particularly useful are methods that detect the allelic variant of the ADH expressed in a subject, and methods that detect fragments of the ADH in a sample.

The ADH polypeptides are also useful in pharmacogenomic analysis. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985, and Linder, M. W. (1997) Clin. Chem. 43(2):254-266. The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants of the ADH in which one or more of the ADH functions in one population is different from those in another population. The polypeptides thus provide a target to ascertain a genetic predisposition that can affect treatment modality. Thus, in an ADH-based treatment, polymorphism may give rise to catalytic regions that are more or less active. Accordingly, dosage would necessarily be modified to maximize the therapeutic effect within a given population containing the polymorphism. As an alternative to genotyping, specific polymorphic polypeptides could be identified.

The ADH polypeptides are also useful for monitoring therapeutic effects during clinical trials and other treatment. Thus, the therapeutic effectiveness of an agent that is designed to increase or decrease gene expression, protein levels or ADH activity can be monitored over the course of treatment using the ADH polypeptides as an end-point target. The monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression or activity of the protein in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein in the post-administration samples; (v) comparing the level of expression or activity of the protein in the pre-administration sample with the protein in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.

Antibodies

The invention also provides antibodies that selectively bind to the ADH and its variants and fragments. An antibody is considered to selectively bind, even if it also binds to other proteins that are not substantially homologous with the ADH. These other proteins share homology with a fragment or domain of the ADH. This conservation in specific regions gives rise to antibodies that bind to both proteins by virtue of the homologous sequence. In this case, it would be understood that antibody binding to the ADH is still selective.

To generate antibodies, an isolated ADH polypeptide is used as an immunogen to generate antibodies using standard techniques for polyclonal and monoclonal antibody preparation. Either the full-length protein or antigenic peptide fragment can be used. Regions having a high antigenicity index are shown in FIGS. 8, 14, 18, 22, 26 and 30.

Antibodies are preferably prepared from these regions or from discrete fragments in these regions. However, antibodies can be prepared from any region of the peptide as described herein. A preferred fragment produces an antibody that diminishes or completely prevents substrate or coenzyme binding or prevents the oxidation of substrate. Antibodies can be developed against the entire ADH or domains of the ADH as described herein. Antibodies can also be developed against specific functional sites as disclosed herein.

The antigenic peptide can comprise a contiguous sequence of at least 12, 14, 15, or amino acid residues. In one embodiment, fragments correspond to regions that are located on the surface of the protein, e.g., hydrophilic regions. These fragments are not to be construed, however, as encompassing any fragments that may have been disclosed prior to the invention.

Antibodies can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

An appropriate immunogenic preparation can be derived from native, recombinantly expressed, or chemically synthesized peptides.

Antibody Uses

The antibodies can be used to isolate an ADH by standard techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate the purification of the natural ADH from cells and recombinantly produced ADH expressed in host cells.

The antibodies are useful to detect the presence of ADH in cells or tissues to determine the pattern of expression of the ADH among various tissues in an organism and over the course of normal development.

The antibodies can be used to detect ADH in situ, in vitro, or in a cell lysate or supernatant in order to evaluate the abundance and pattern of expression.

The antibodies can be used to assess abnormal tissue distribution or abnormal expression during development.

Antibody detection of circulating fragments of the full length ADH can be used to identify ADH turnover.

Further, the antibodies can be used to assess ADH expression in disease states such as in active stages of the disease or in an individual with a predisposition toward disease related to ADH function. When a disorder is caused by an inappropriate tissue distribution, developmental expression, or level of expression of the ADH protein, the antibody can be prepared against the normal ADH protein. If a disorder is characterized by a specific mutation in the ADH, antibodies specific for this mutant protein can be used to assay for the presence of the specific mutant ADH. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular ADH peptide regions.

The antibodies can also be used to assess normal and aberrant subcellular localization of ADH in cells in the various tissues in an organism. Antibodies can be developed against the whole ADH or portions of the ADH.

The diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment is ultimately aimed at correcting ADH expression level or the presence of aberrant ADHs and aberrant tissue distribution or developmental expression, antibodies directed against the ADH or relevant fragments can be used to monitor therapeutic efficacy.

Antibodies accordingly can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen.

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared against polymorphic ADH can be used to identify individuals that require modified treatment modalities.

The antibodies are also useful as diagnostic tools as an immunological marker for aberrant ADH analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and other physical assays known to those in the art.

The antibodies are also useful for tissue typing. Thus, where a specific ADH has been correlated with expression in a specific tissue, antibodies that are specific for this ADH can be used to identify a tissue type.

The antibodies are also useful in forensic identification. Accordingly, where an individual has been correlated with a specific genetic polymorphism resulting in a specific polymorphic protein, an antibody specific for the polymorphic protein can be used as an aid in identification.

The antibodies are also useful for inhibiting ADH function, for example, blocking substrate or coenzyme binding or disrupting the oxidation/reduction of substrate.

These uses can also be applied in a therapeutic context in which treatment involves inhibiting ADH function. An antibody can be used, for example, to block coenzyme or substrate binding. Antibodies can be prepared against specific fragments containing sites required for function or against intact ADH associated with a cell.

Completely human antibodies are particularly desirable for therapeutic treatment of human patients. For an overview of this technology for producing human antibodies, see Lonberg et al. (1995) Int. Rev. Immunol. 13:65-93. For a detailed discussion of this technology for producing human antibodies and human monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Pat. No. 5,625,126; U.S. Pat. No. 5,633,425; U.S. Pat. No. 5,569,825; U.S. Pat. No. 5,661,016; and U.S. Pat. No. 5,545,806.

The invention also encompasses kits for using antibodies to detect the presence of an ADH protein in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and a compound or agent for detecting ADH in a biological sample; means for determining the amount of ADH in the sample; and means for comparing the amount of ADH in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect ADH.

Polynucleotides

The nucleotide sequences in SEQ ID NOS:6, 8, 10, 12, and 14 were obtained by sequencing the deposited human cDNAs. Accordingly, the sequences of the deposited clones are controlling as to any discrepancies between the two, and any reference to the sequences of SEQ ID NOS:6, 8, 10, 12, and 14 includes reference to the sequences of the deposited cDNAs.

The specifically disclosed cDNAs comprise the coding region and 5′ and 3′ untranslated sequences in SEQ ID NOS:6, 8, 10, 12, and 14.

The invention provides isolated polynucleotides encoding the novel ADHs. The term “ADH polynucleotide” or “ADH nucleic acid” refers to the sequences shown in SEQ ID NOS:6, 8, 10, 12, and 14 or in the deposited cDNAs. The term “ADH polynucleotide” or “ADH nucleic acid” further includes variants and fragments of the ADH polynucleotides.

An “isolated” ADH nucleic acid is one that is separated from other nucleic acid present in the natural source of the ADH nucleic acid. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the ADH nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for example up to about 5 KB. The important point is that the ADH nucleic acid is isolated from flanking sequences such that it can be subjected to the specific manipulations described herein, such as recombinant expression, preparation of probes and primers, and other uses specific to the ADH nucleic acid sequences.

Moreover, an “isolated” nucleic acid molecule, such as a cDNA or RNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.

In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present.

For example, recombinant DNA molecules contained in a vector are considered isolated. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

The ADH polynucleotides can encode the mature protein plus additional amino or carboxyterminal amino acids, or amino acids interior to the mature polypeptide (when the mature form has more than one polypeptide chain, for instance). Such sequences may play a role in processing of a protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein half-life or facilitate manipulation of a protein for assay or production, among other things. As generally is the case in situ, the additional amino acids may be processed away from the mature protein by cellular enzymes.

The ADH polynucleotides include, but are not limited to, the sequence encoding the mature polypeptide alone, the sequence encoding the mature polypeptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or pro-protein sequence), the sequence encoding the mature polypeptide, with or without the additional coding sequences, plus additional non-coding sequences, for example introns and non-coding 5′ and 3′ sequences such as transcribed but non-translated sequences that play a role in transcription, mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability of mRNA. In addition, the polynucleotide may be fused to a marker sequence encoding, for example, a peptide that facilitates purification.

ADH polynucleotides can be in the form of RNA, such as mRNA, or in the form of DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the non-coding strand (anti-sense strand).
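
The relationship between the coding strand and the non-coding strand can be illustrated with a minimal sketch. The function name and the six-nucleotide example are hypothetical and are not taken from any SEQ ID NO; the sketch assumes a DNA sequence containing only A, C, G and T.

```python
# Illustrative only: map each DNA base to its Watson-Crick partner.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(coding_strand: str) -> str:
    """Return the non-coding (anti-sense) strand, written 5' to 3'.

    Assumes the input contains only A, C, G and T; any other
    character raises a KeyError.
    """
    return "".join(COMPLEMENT[base] for base in reversed(coding_strand.upper()))


# Hypothetical sense-strand sequence, not taken from the specification.
example_sense = "ATGGCC"
example_antisense = reverse_complement(example_sense)
```

Applying the function twice returns the original strand, which is one simple way to check that a sequence and its complement have been paired correctly.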

ADH nucleic acid can comprise the nucleotide sequences shown in SEQ ID NOS:6, 8, 10, 12, and 14, corresponding to the human 21620, 33756, 21676, 21612, and 21615 ADH cDNAs, respectively.

In one embodiment, the ADH nucleic acid comprises only the coding region.

The invention further provides variant ADH polynucleotides, and fragments thereof, that differ from the nucleotide sequences shown in SEQ ID NOS:6, 8, 10, 12, and 14 due to degeneracy of the genetic code and thus encode the same protein as that encoded by the nucleotide sequences shown in SEQ ID NOS:6, 8, 10, 12, and 14.
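
Degeneracy of the genetic code can be illustrated with a short sketch in which two different nucleotide sequences translate to the same amino acid sequence. The sketch assumes the standard genetic code; the two twelve-nucleotide sequences and the function name are hypothetical and do not come from the disclosed sequences.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_sequence: str) -> str:
    """Translate a DNA coding sequence codon by codon, ignoring a trailing partial codon."""
    dna = coding_sequence.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences that differ at two codons (GCT vs GCC, CTG vs TTA).
original = "ATGGCTCTGTAA"
degenerate_variant = "ATGGCCTTATAA"
same_protein = translate(original) == translate(degenerate_variant)
```

Here both sequences translate to Met-Ala-Leu followed by a stop codon, even though the nucleotide sequences differ.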

The invention also provides ADH nucleic acid molecules encoding the variant polypeptides described herein. Such polynucleotides may be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or may be constructed by recombinant DNA methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. Accordingly, as discussed above, the variants can contain nucleotide substitutions, deletions, inversions and insertions.

Typically, variants have a substantial identity with nucleic acid molecules of SEQ ID NOS:6, 8, 10, 12, and 14, and the complements thereof. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions.

Orthologs, homologs, and allelic variants can be identified using methods well known in the art. These variants comprise a nucleotide sequence encoding an ADH that is at least about 60-65%, 65-70%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more homologous to the nucleotide sequence shown in SEQ ID NOS:6, 8, 10, 12, and 14, or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions to the nucleotide sequence shown in SEQ ID NOS:6, 8, 10, 12, and 14 or a fragment of the sequence. It is understood that stringent hybridization does not indicate substantial homology where it is due to general homology, such as poly A sequences, or sequences common to all or most proteins, all ADHs, or all short-chain dehydrogenase/reductases. Moreover, it is understood that variants do not include any of the nucleic acid sequences that may have been disclosed prior to the invention.
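
As a rough illustration of how a percentage figure of this kind can be computed, the sketch below calculates percent identity over two sequences that have already been aligned. The function name, the use of "-" as the gap character, and the choice to count every aligned column (other than gap-against-gap columns) in the denominator are assumptions made for illustration; they are not a definition taken from this specification, and alignment programs differ in how they treat gaps.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Inputs must be pre-aligned and of equal length, with '-' marking gaps.
    Columns that are gaps in both sequences are skipped; a gap opposite a
    residue counts as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical ten-nucleotide sequences differing at a single position.
identity = percent_identity("ATGGCCTTAA", "ATGGCTTTAA")
```

With one mismatch in ten aligned positions, the two hypothetical sequences are 90% identical under these assumptions.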

As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences encoding a polypeptide at least about 60-65% homologous to each other typically remain hybridized to each other. The conditions can be such that sequences at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 90%, at least about 95% or more identical to each other remain hybridized to one another. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, incorporated by reference. One example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. In another non-limiting example, nucleic acid molecules are allowed to hybridize in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more low stringency washes in 0.2×SSC/0.1% SDS at room temperature, or by one or more moderate stringency washes in 0.2×SSC/0.1% SDS at 42° C., or washed in 0.2×SSC/0.1% SDS at 65° C. for high stringency. In one embodiment, an isolated nucleic acid molecule that hybridizes under stringent conditions to the sequence of SEQ ID NOS:6, 8, 10, 12, and 14 corresponds to a naturally-occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

As understood by those of ordinary skill, the exact conditions can be determined empirically and depend on ionic strength, temperature and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS. Other factors considered in determining the desired hybridization conditions include the length of the nucleic acid sequences, base composition, percent mismatch between the hybridizing sequences and the frequency of occurrence of subsets of the sequences within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules.

The present invention also provides isolated nucleic acids that contain a single or double stranded fragment or portion that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NOS:6, 8, 10, 12, and 14 or the complement of SEQ ID NOS: 6, 8, 10, 12, and 14. In one embodiment, the nucleic acid consists of a portion of the nucleotide sequence of SEQ ID NOS: 6, 8, 10, 12, and 14, and the complement of SEQ ID NOS: 6, 8, 10, 12, and 14.

It is understood that isolated fragments include any contiguous sequence not disclosed prior to the invention as well as sequences that are substantially the same and which are not disclosed. Accordingly, if a fragment is disclosed prior to the present invention, that fragment is not intended to be encompassed by the invention. When a sequence is not disclosed prior to the present invention, an isolated nucleic acid fragment is at least about 12, preferably at least about 15, 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100, 200, 500 or more nucleotides in length. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic proteins or polypeptides described herein are useful.
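
The notion of contiguous fragments of a given minimum length can be sketched as a sliding window over a sequence. The function name and the fourteen-nucleotide example are hypothetical; the sketch only enumerates candidate fragments and makes no attempt to decide whether a fragment was disclosed previously.

```python
def contiguous_fragments(sequence: str, length: int) -> list[str]:
    """Return every contiguous fragment of exactly `length` nucleotides.

    An empty list is returned when the sequence is shorter than `length`.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 14-nucleotide sequence; 12 is the smallest fragment length
# mentioned in the paragraph above.
example_sequence = "ATGGCCTTAACGGA"
fragments = contiguous_fragments(example_sequence, 12)
```

A sequence of n nucleotides yields n - 12 + 1 fragments of length 12, so the hypothetical 14-nucleotide example yields three.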

For the 21620 ADH, for example, nucleotide sequences from about 265 to about 300, from about 782 to about 870, from about 1003 to about 1035, and from about 1096 to about 1158 are not disclosed prior to the present invention. The nucleotide sequences from about 1 to about 301 encompass fragments greater than about 125, 135, 145 or 155 nucleotides; the nucleotide sequences from about 138 to about 1159 encompass fragments greater than 268, 280, 290, or 300 nucleotides; the nucleotide sequences from about 871 to about 1560 encompass fragments greater than 265, 275, 285, or 295 nucleotides; and the nucleotide sequences from about 1036 to about 1877 encompass fragments greater than 266, 275, 285, or 295 nucleotides.

For the 33756 ADH, for example, nucleotide sequences from about 66 to about 242 are not disclosed prior to the present invention. The nucleotide sequences from about 1 to about 454 encompass fragments greater than 21, 25, 30, or 35 nucleotides; the nucleotide sequences from about 1 to about 700 encompass fragments greater than 240, 250, 260 or 275 nucleotides; and the nucleotide sequences from about 1 to about 1153 encompass fragments greater than 574, 580, 590 or 600 nucleotides.

For the 21676 ADH, for example, nucleotide sequences from about 1 to about 14, from about 69 to about 94, and from about 206 to about 1699 are not disclosed prior to the present invention. The nucleotide sequences from about 1 to about 206 encompass fragments greater than 20, 25, 30, 35, 40 or 45 nucleotides.

For the 21612 ADH, for example, nucleotide sequences from about 32 to about 51, from about 679 to about 710, and from about 1525 to about 2535 are not disclosed prior to the present invention. The nucleotide sequences from about 1 to about 678 encompass fragments greater than 247, 260, 270, or 280 nucleotides and the nucleotide sequences from about 147 to about 2535 encompass fragments greater than 417, 425, 435, 445 or 455 nucleotides.

For the 21615 ADH, for example, nucleotide sequences from about 538 to about 1615 are not disclosed prior to the present invention. The nucleotide sequence from about nucleotide 1 to about nucleotide 788 encompasses fragments greater than 230, 240, 250 or 260 nucleotides and the nucleotide sequence from about nucleotide 442 to about 1615 encompasses fragments greater than 670, 680, 690 or 700 nucleotides.
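
The nucleotide ranges recited above are given as positions within a sequence. As a minimal sketch, assuming that positions are counted from 1 and that both endpoints are included, a region can be extracted as follows. The function name and the ten-nucleotide example are hypothetical; the actual sequences are those of the SEQ ID NOS and are not reproduced here.

```python
def region(sequence: str, start: int, end: int) -> str:
    """Return nucleotides `start` through `end`, 1-based and inclusive.

    Assumes 1 <= start <= end <= len(sequence); this numbering convention
    is an assumption of the sketch.
    """
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("positions out of range")
    return sequence[start - 1:end]


# Hypothetical sequence: positions 3 through 6 span four nucleotides.
example_region = region("ACGTACGTAC", 3, 6)
```

Under this convention a range from position a to position b contains b - a + 1 nucleotides, so a range from 265 to 300 would span 36 nucleotides.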

Furthermore, the invention provides polynucleotides that comprise a fragment of the full-length ADH polynucleotides. The fragment can be single or double-stranded and can comprise DNA or RNA. The fragment can be derived from either the coding or the non-coding sequence.

In another embodiment an isolated ADH nucleic acid encodes the entire coding region. In another embodiment the isolated ADH nucleic acid encodes a sequence corresponding to the mature protein. For example, the mature form of the 21676 ADH is from about amino acid 16 to the last amino acid. Other fragments include nucleotide sequences encoding the amino acid fragments described herein.

Thus, ADH nucleic acid fragments further include sequences corresponding to the domains described herein, subregions also described, and specific functional sites. ADH nucleic acid fragments also include combinations of the domains, segments, and other functional sites described above. A person of ordinary skill in the art would be aware of the many permutations that are possible.

Where the locations of the domains or sites have been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains can vary depending on the criteria used to define the domains.

However, it is understood that an ADH fragment includes any nucleic acid sequence that does not include the entire gene.

The invention also provides ADH nucleic acid fragments that encode epitope-bearing regions of the ADH proteins described herein.

Nucleic acid fragments, according to the present invention, are not to be construed as encompassing those fragments that may have been disclosed prior to the invention.

Polynucleotide Uses

The nucleotide sequences of the present invention can be used as a “query sequence” to perform a search against public databases, for example, to identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

The nucleic acid fragments of the invention provide probes or primers in assays such as those described below. “Probes” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al. (1991) Science 254:1497-1500. Typically, a probe comprises a region of nucleotide sequence that hybridizes under highly stringent conditions to at least about 15, typically about 20-25, and more typically about 40, 50 or 75 consecutive nucleotides of the nucleic acid sequence shown in SEQ ID NOS: 6, 8, 10, 12, and 14 and the complements thereof. More typically, the probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis using well-known methods (e.g., PCR, LCR) including, but not limited to, those described herein. The appropriate length of the primer depends on the particular use, but typically ranges from about 15 to 30 nucleotides. The term “primer site” refers to the area of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the nucleic acid sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the sequence to be amplified.
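By way of illustration only, the notion of a primer pair can be sketched in a few lines of Python. The sketch assumes one common convention: the upstream primer is taken to match the 5′ end of the sense strand, and the downstream primer is taken to be the reverse complement of the 3′ end. The function names and the default length of 20 are hypothetical, and practical primer design would also weigh melting temperature, GC content, and self-complementarity, none of which is modeled here.

```python
# Illustrative sketch only: names and defaults are assumptions, not part of the disclosure.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A<->T, C<->G)."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_pair(target: str, length: int = 20) -> tuple:
    """Return (upstream, downstream) primers flanking the target sense strand.

    The upstream primer copies the first `length` bases of the target.
    The downstream primer is the reverse complement of the last `length` bases.
    """
    if not 15 <= length <= 30:
        raise ValueError("primer length is typically about 15 to 30 nucleotides")
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    upstream = target[:length]
    downstream = reverse_complement(target[-length:])
    return upstream, downstream
```

For a target consisting of fifteen A residues followed by fifteen C residues, `primer_pair(target, 15)` would return fifteen A residues as the upstream primer and fifteen G residues as the downstream primer.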

The ADH polynucleotides are thus useful for probes, primers, and in biological assays.

Where the polynucleotides are used to assess ADH properties or functions, such as in the assays described herein, all or less than all of the entire cDNA can be useful. Assays specifically directed to ADH functions, such as assessing agonist or antagonist activity, encompass the use of known fragments. Further, diagnostic methods for assessing ADH function can also be practiced with any fragment, including those fragments that may have been known prior to the invention. Similarly, in methods involving treatment of ADH dysfunction, all fragments are encompassed, including those that may have been known in the art.

The ADH polynucleotides are useful as a hybridization probe for cDNA and genomic DNA to isolate a full-length cDNA and genomic clones encoding the polypeptides described in SEQ ID NOS:5, 7, 9, 11, and 13, and to isolate cDNA and genomic clones that correspond to variants producing the same polypeptides shown in SEQ ID NOS: 5, 7, 9, 11, and 13 or the other variants described herein. Variants can be isolated from the same tissue and organism from which the polypeptides shown in SEQ ID NOS: 5, 7, 9, 11, and 13 were isolated, different tissues from the same organism, or from different organisms. This method is useful for isolating genes and cDNA that are developmentally controlled and therefore may be expressed in the same tissue or different tissues at different points in the development of an organism.

The probe can correspond to any sequence along the entire length of the gene encoding the ADH. Accordingly, it could be derived from 5′ noncoding regions, the coding region, and 3′ noncoding regions.

The nucleic acid probe can be, for example, the full-length cDNA of SEQ ID NOS:6, 8, 10, 12, and 14, or a fragment thereof, such as an oligonucleotide of at least 12, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to mRNA or DNA.

Fragments of the polynucleotides described herein are also useful to synthesize larger fragments or full-length polynucleotides described herein. For example, a fragment can be hybridized to any portion of an mRNA and a larger or full-length cDNA can be produced.

The fragments are also useful to synthesize antisense molecules of desired length and sequence.

Antisense nucleic acids of the invention can be designed using the nucleotide sequences of SEQ ID NOS:6, 8, 10, 12, and 14, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest).

Additionally, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670. PNAs can be further modified, e.g., to enhance their stability, specificity or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra, Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63, Mag et al. (1989) Nucleic Acids Res. 17:5973, and Peterser et al. (1975) Bioorganic Med. Chem. Lett. 5:1119.

The nucleic acid molecules and fragments of the invention can also include other appended groups such as peptides (e.g., for targeting host cell ADHs in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/0918) or the blood brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm Res. 5:539-549).

The ADH polynucleotides are also useful as primers for PCR to amplify any given region of an ADH polynucleotide.

The ADH polynucleotides are also useful for constructing recombinant vectors. Such vectors include expression vectors that express a portion of, or all of, the ADH polypeptides. Vectors also include insertion vectors, used to integrate into another polynucleotide sequence, such as into the cellular genome, to alter in situ expression of ADH genes and gene products. For example, an endogenous ADH coding sequence can be replaced via homologous recombination with all or part of the coding region containing one or more specifically introduced mutations.

The ADH polynucleotides are also useful for expressing antigenic portions of the ADH proteins.

The ADH polynucleotides are also useful as probes for determining the chromosomal positions of the ADH polynucleotides by means of in situ hybridization methods, such as FISH (for a review of this technique, see Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York)), and PCR mapping of somatic cell hybrids. The mapping of the sequences to chromosomes is an important first step in correlating these sequences with genes associated with disease.

Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease mapped to the same chromosomal region can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland et al. ((1987) Nature 325:783-787).

Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with a specified gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations, that are visible from chromosome spreads, or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

The ADH polynucleotide probes are also useful to determine patterns of the presence of the gene encoding the ADHs and their variants with respect to tissue distribution, for example, whether gene duplication has occurred and whether the duplication occurs in all or only a subset of tissues. The genes can be naturally occurring or can have been introduced into a cell, tissue, or organism exogenously.

The ADH polynucleotides are also useful for designing ribozymes corresponding to all, or a part, of the mRNA produced from genes encoding the polynucleotides described herein.

The ADH polynucleotides are also useful for constructing host cells expressing a part, or all, of the ADH polynucleotides and polypeptides.

The ADH polynucleotides are also useful for constructing transgenic animals expressing all, or a part, of the ADH polynucleotides and polypeptides.

The ADH polynucleotides are also useful for making vectors that express part, or all, of the ADH polypeptides.

The ADH polynucleotides are also useful as hybridization probes for determining the level of ADH nucleic acid expression. Accordingly, the probes can be used to detect the presence of, or to determine levels of, ADH nucleic acid in cells, tissues, and in organisms. The nucleic acid whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the polypeptides described herein can be used to assess gene copy number in a given cell, tissue, or organism. This is particularly relevant in cases in which there has been an amplification of the ADH genes.

Alternatively, the probe can be used in an in situ hybridization context to assess the position of extra copies of the ADH genes, as on extrachromosomal elements or as integrated into chromosomes in which the ADH gene is not normally found, for example as a homogeneously staining region.

These uses are relevant for diagnosis of disorders involving an increase or decrease in ADH expression relative to normal, such as a proliferative disorder or a differentiative or developmental disorder.

Tissues and/or cells in which the 21620 ADH is expressed are shown in FIGS. 11 and 12 and are described above herein. As such, the gene is particularly relevant for the treatment of disorders involving these tissues.

Furthermore, disorders in which ADH expression is relevant are disclosed herein above.

Thus, the present invention provides a method for identifying a disease or disorder associated with aberrant expression or activity of ADH nucleic acid, in which a test sample is obtained from a subject and nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of the nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity of the nucleic acid.

One aspect of the invention relates to diagnostic assays for determining nucleic acid expression as well as activity in the context of a biological sample (e.g., blood, serum, cells, tissue) to determine whether an individual has a disease or disorder, or is at risk of developing a disease or disorder, associated with aberrant nucleic acid expression or activity. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with expression or activity of the nucleic acid molecules.

In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express the ADH, such as by measuring the level of an ADH-encoding nucleic acid in a sample of cells from a subject, e.g., mRNA or genomic DNA, or determining if the ADH gene has been mutated.

Nucleic acid expression assays are useful for drug screening to identify compounds that modulate ADH nucleic acid expression (e.g., antisense, polypeptides, peptidomimetics, small molecules or other drugs). A cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of the mRNA in the presence of the candidate compound is compared to the level of expression of the mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. The modulator can bind to the nucleic acid or indirectly modulate expression, such as by interacting with other cellular components that affect nucleic acid expression.

Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject) in patients or in transgenic animals.

The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with nucleic acid expression of the ADH gene. The method typically includes assaying the ability of the compound to modulate the expression of the ADH nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by undesired ADH nucleic acid expression.

The assays can be performed in cell-based and cell-free systems. Cell-based assays include cells naturally expressing the ADH nucleic acid or recombinant cells genetically engineered to express specific nucleic acid sequences.

Alternatively, candidate compounds can be assayed in vivo in patients or in transgenic animals.

The assay for ADH nucleic acid expression can involve direct assay of nucleic acid levels, such as mRNA levels, or assay of collateral compounds involved in the ADH-catalyzed reaction (such as oxidized/reduced products, NAD⁺/NADH ratio, or components of the retinoic acid signaling pathway). Further, the expression of genes that are up- or down-regulated in response to the ADH signal pathway can also be assayed. In this embodiment the regulatory regions of these genes can be operably linked to a reporter gene such as luciferase.

Thus, modulators of ADH gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of ADH mRNA in the presence of the candidate compound is compared to the level of expression of ADH mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
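The comparison described above can be illustrated, purely as a sketch, by the following Python code. It assumes replicate mRNA measurements in arbitrary units for treated and untreated cells, and it uses an exact two-sided permutation test on the difference of means as one possible way of judging statistical significance; the specification does not prescribe any particular test. The function names and the 0.05 threshold are hypothetical.

```python
# Illustrative sketch only: the test, names, and threshold are assumptions.
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in mean expression."""
    observed = abs(mean(treated) - mean(untreated))
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_untreated = len(pooled) - n_treated
    total = sum(pooled)
    splits = 0
    extreme = 0
    # Enumerate every way of relabeling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_untreated)
        splits += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    return extreme / splits


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound as a stimulator, an inhibitor, or neither."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"
```

Under this particular test, at least four replicates per group are needed before any result can fall below 0.05, because with four and four measurements the smallest attainable p-value is 2/70. Exhaustive enumeration is practical only for small replicate numbers.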

Accordingly, the invention provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate ADH nucleic acid expression. Modulation includes up-regulation (i.e., activation or agonization), down-regulation (i.e., suppression or antagonization), and effects on nucleic acid activity (e.g., when nucleic acid is mutated or improperly modified). Treatment is of disorders characterized by aberrant expression or activity of the nucleic acid. Disorders that the gene is particularly relevant for treating have been disclosed herein above.

Alternatively, a modulator for ADH nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the ADH nucleic acid expression.

The ADH polynucleotides are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the ADH gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

Monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of a specified mRNA or genomic DNA of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the mRNA or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the mRNA or genomic DNA in the pre-administration sample with the mRNA or genomic DNA in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.
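Steps (v) and (vi) above can be sketched, under stated assumptions, as a small decision function. The sketch assumes that the agent is one that suppresses expression, so that a post-administration level above a desired range suggests increased administration and a level below the range suggests decreased administration. The function name, the units, and the desired range are hypothetical and would be set by the practitioner.

```python
# Illustrative sketch only: assumes an expression-suppressing agent and a
# practitioner-supplied desired range; all names are hypothetical.
from statistics import mean


def monitor(pre_level, post_levels, desired_low, desired_high):
    """Compare pre- and post-administration expression and suggest an action.

    Returns (fold_change, action), where fold_change is the mean
    post-administration level divided by the pre-administration level.
    """
    if pre_level <= 0:
        raise ValueError("pre-administration level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration sample is required")
    post = mean(post_levels)
    fold_change = post / pre_level
    if post > desired_high:
        action = "increase administration"
    elif post < desired_low:
        action = "decrease administration"
    else:
        action = "maintain administration"
    return fold_change, action
```

For an agent that stimulates expression, the two adjustment branches would simply be exchanged.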

The ADH polynucleotides are also useful in diagnostic assays for qualitative changes in ADH nucleic acid, and particularly in qualitative changes that lead to pathology. The polynucleotides can be used to detect mutations in ADH genes and gene expression products such as mRNA. The polynucleotides can be used as hybridization probes to detect naturally-occurring genetic mutations in the ADH gene and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene, chromosomal rearrangement, such as inversion or transposition, modification of genomic DNA, such as aberrant methylation patterns or changes in gene copy number, such as amplification. Detection of a mutated form of the ADH gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of an ADH.

Mutations in the ADH gene can be detected at the nucleic acid level by a variety of techniques. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way.

In certain embodiments, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) PNAS 91:360-364), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.
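The size comparison described above can be illustrated by a simple in silico sketch. It assumes exact primer matching on a single linear template strand, which is a considerable simplification of a real amplification reaction, and the function names and example sequences are hypothetical.

```python
# Illustrative sketch only: exact-match "in silico PCR" on one strand.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product bounded by the two primers, or None if absent.

    `forward` must occur in the template; the reverse complement of
    `reverse` must occur downstream of it.
    """
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


def compare_to_control(sample_len, control_len):
    """Describe the sample amplicon relative to the control amplicon."""
    if sample_len is None:
        return "no amplification product"
    delta = sample_len - control_len
    if delta == 0:
        return "same size as control"
    kind = "insertion" if delta > 0 else "deletion"
    return f"{kind} of {abs(delta)} bp"
```

With a hypothetical control template `"CCATGGCC" + "A" * 10 + "TTAGGCGG"`, a forward primer `"ATGGCC"`, and a reverse primer `"GCCTAA"`, the control product is 22 bp; a sample template in which the run of A residues is shortened to six yields an 18 bp product, which the comparison reports as a deletion of 4 bp.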

It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well-known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.

Alternatively, mutations in an ADH gene can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
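As a minimal sketch of how a sequence change can alter a digestion pattern, the following Python code computes the fragment lengths expected from digesting a linear sequence at a single recognition site. EcoRI, which recognizes GAATTC and cuts after the first G, is used only as a familiar example; the function name and the toy sequences are hypothetical, and only one strand of a linear molecule is considered.

```python
# Illustrative sketch only: single enzyme, linear molecule, one strand.
def fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths, largest first, after cutting at every site.

    `cut_offset` is the number of bases into the recognition site at which
    the strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    lengths = [b - a for a, b in zip(bounds, bounds[1:]) if b > a]
    return sorted(lengths, reverse=True)


def pattern_altered(wild_type, mutant, site="GAATTC", cut_offset=1):
    """True when the two sequences would give different fragment patterns."""
    return fragment_lengths(wild_type, site, cut_offset) != fragment_lengths(
        mutant, site, cut_offset
    )
```

A point mutation that destroys the recognition site, for example GAATTC to GAGTTC, leaves the molecule uncut, so two bands on the gel are replaced by one.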

Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.

Perfectly matched sequences can be distinguished from mismatched sequences by nuclease cleavage digestion assays or by differences in melting temperature.

Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method.

Furthermore, sequence differences between a mutant ADH gene and a wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-159).

Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) PNAS 85:4397; Saleeba et al. (1992) Meth. Enzymol. 217:286-295), methods in which the electrophoretic mobility of mutant and wild-type nucleic acid is compared (Orita et al. (1989) PNAS 86:2766; Cotton et al. (1993) Mutat. Res. 285:125-144; and Hayashi et al. (1992) Genet. Anal. Tech. Appl. 9:73-79), and methods in which movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al. (1985) Nature 313:495). The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In one embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet. 7:5). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

In other embodiments, genetic mutations can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al. (1996) Human Mutation 7:244-255; Kozal et al. (1996) Nature Medicine 2:753-759). For example, genetic mutations can be identified in two dimensional arrays containing light-generated DNA probes as described in Cronin et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
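The two layouts described above, sequential overlapping probes followed by paired wild-type and mutant probes, can be sketched as follows. The sketch returns windows of the target sequence itself; a physical probe would be the complement of each window. The function names, probe lengths, and step size are hypothetical and are not taken from Cronin et al.

```python
# Illustrative sketch only: returns target-strand windows, not probe chemistry.
def tiling_windows(seq, probe_len=25, step=1):
    """Return (start, window) pairs for sequential overlapping windows."""
    if probe_len > len(seq):
        return []
    return [
        (i, seq[i:i + probe_len])
        for i in range(0, len(seq) - probe_len + 1, step)
    ]


def mutation_probe_pair(seq, pos, alt, flank=12):
    """Return (wild_type, mutant) windows centered on position `pos`.

    `pos` is a 0-based index into `seq`, and `alt` is the substituted base.
    The position must have `flank` bases available on both sides.
    """
    if pos - flank < 0 or pos + flank >= len(seq):
        raise ValueError("position is too close to the end of the sequence")
    wild_type = seq[pos - flank:pos + flank + 1]
    mutant = wild_type[:flank] + alt + wild_type[flank + 1:]
    return wild_type, mutant
```

With a step of 1, adjacent windows overlap by all but one base, so every position of the target is interrogated by several probes; larger steps trade coverage for a smaller array.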

The ADH polynucleotides are also useful for testing an individual for a genotype that, while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the polynucleotides can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). In the present case, for example, a mutation in the ADH gene that results in altered affinity for a coenzyme could result in an excessive or decreased drug effect with standard concentrations of the coenzyme that activates the ADH. Accordingly, the ADH polynucleotides described herein can be used to assess the mutation content of the gene in an individual in order to select an appropriate compound or dosage regimen for treatment.

Thus, polynucleotides displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

The methods can involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting mRNA, or genomic DNA, such that the presence of mRNA or genomic DNA is detected in the biological sample, and comparing the presence of mRNA or genomic DNA in the control sample with the presence of mRNA or genomic DNA in the test sample.

The ADH polynucleotides are also useful for chromosome identification when the sequence is identified with an individual chromosome and to a particular location on the chromosome. First, the DNA sequence is matched to the chromosome by in situ or other chromosome-specific hybridization. Sequences can also be correlated to specific chromosomes by preparing PCR primers that can be used for PCR screening of somatic cell hybrids containing individual chromosomes from the desired species. Only hybrids containing the chromosome containing the gene homologous to the primer will yield an amplified fragment. Sublocalization can be achieved using chromosomal fragments. Other strategies include prescreening with labeled flow-sorted chromosomes and preselection by hybridization to chromosome-specific libraries. Further mapping strategies include fluorescence in situ hybridization, which allows hybridization with probes shorter than those traditionally used. Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on the chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

The ADH polynucleotides can also be used to identify individuals from small biological samples. This can be done, for example, using restriction fragment-length polymorphism (RFLP) to identify an individual. Thus, the polynucleotides described herein are useful as DNA markers for RFLP (See U.S. Pat. No. 5,272,057).

Furthermore, the ADH sequence can be used to provide an alternative technique, which determines the actual DNA sequence of selected fragments in the genome of an individual. Thus, the ADH sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify DNA from an individual for subsequent sequencing.

Panels of corresponding DNA sequences from individuals prepared in this manner can provide unique individual identifications, as each individual will have a unique set of such DNA sequences. It is estimated that allelic variation in humans occurs with a frequency of about once per 500 bases. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. The ADH sequences can be used to obtain such identification sequences from individuals and from tissue. The sequences represent unique fragments of the human genome. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes.

If a panel of reagents from the sequences is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

The ADH polynucleotides can also be used in forensic identification procedures. PCR technology can be used to amplify DNA sequences taken from very small biological samples, such as a single hair follicle or body fluids (e.g., blood, saliva, or semen). The amplified sequence can then be compared to a standard, allowing identification of the origin of the sample.

The ADH polynucleotides can thus be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As described above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to the noncoding region are particularly useful since greater polymorphism occurs in the noncoding regions, making it easier to differentiate individuals using this technique.

The ADH polynucleotides can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This is useful in cases in which a forensic pathologist is presented with a tissue of unknown origin. Panels of ADH probes can be used to identify tissue by species and/or by organ type.

In a similar fashion, these primers and probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Alternatively, the ADH polynucleotides can be used directly to block transcription or translation of ADH gene sequences by means of antisense or ribozyme constructs. Thus, in a disorder characterized by abnormally high or undesirable ADH gene expression, nucleic acids can be directly used for treatment.

The ADH polynucleotides are thus useful as antisense constructs to control ADH gene expression in cells, tissues, and organisms. A DNA antisense polynucleotide is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of ADH protein. An antisense RNA or DNA polynucleotide would hybridize to the mRNA and thus block translation of mRNA into ADH protein.

Examples of antisense molecules useful to inhibit nucleic acid expression include antisense molecules complementary to a fragment of the 5′ untranslated region of SEQ ID NOS:6, 8, 10, 12, and 14 that also includes the start codon, and antisense molecules that are complementary to a fragment of the 3′ untranslated region of SEQ ID NOS:6, 8, 10, 12, and 14.
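
As a rough illustration of an antisense molecule spanning part of the 5′ untranslated region and the start codon, the following sketch builds a DNA oligonucleotide complementary to a window around the start codon of an mRNA. The function name, the window sizes, and the example mRNA are hypothetical. The sketch also assumes, as a simplification, that the first AUG in the supplied sequence is the start codon, which is not true of every transcript.

```python
# Minimal sketch: antisense DNA oligo covering a window around the start codon.
# Names, window sizes, and the toy mRNA are assumptions for illustration only.

# Maps each RNA base of the target to the DNA base that pairs with it.
RNA_TO_ANTISENSE_DNA = str.maketrans("ACGU", "TGCA")


def antisense_over_start(mrna, upstream=10, downstream=8):
    """Return a DNA antisense oligo (5' to 3') against the start-codon region.

    upstream:   bases of 5' UTR to include before the AUG.
    downstream: bases of coding sequence to include after the AUG.
    Assumes the first AUG found is the start codon (a simplification).
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input as well
    start = mrna.find("AUG")
    if start < 0:
        raise ValueError("no AUG found in the supplied sequence")
    lo = max(0, start - upstream)
    hi = min(len(mrna), start + 3 + downstream)
    target = mrna[lo:hi]
    # Complement each base, then reverse so the oligo reads 5' to 3'.
    return target.translate(RNA_TO_ANTISENSE_DNA)[::-1]


if __name__ == "__main__":
    toy_mrna = "GGCACGAUGGCUAAA"  # made-up transcript fragment
    print(antisense_over_start(toy_mrna, upstream=3, downstream=3))
```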

Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of an ADH nucleic acid. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired ADH nucleic acid expression. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the ADH protein.

The ADH polynucleotides also provide vectors for gene therapy in patients containing cells that are aberrant in ADH gene expression. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired ADH protein to treat the individual.

The invention also encompasses kits for detecting the presence of an ADH nucleic acid in a biological sample. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting ADH nucleic acid in a biological sample; means for determining the amount of ADH nucleic acid in the sample; and means for comparing the amount of ADH nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect ADH mRNA or DNA.

Computer Readable Means

The nucleotide or amino acid sequences of the invention are also provided in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a nucleotide or amino acid sequence of the present invention. Such a manufacture provides the nucleotide or amino acid sequences, or a subset thereof (e.g., a subset of open reading frames (ORFs)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form.

In one application of this embodiment, a nucleotide or amino acid sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. The skilled artisan will readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. The skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide or amino acid sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
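
The two storage routes mentioned above, an ASCII text file and a database application, can be sketched as follows. This is an illustration only: the FASTA-style text layout and the SQLite database from the Python standard library are substituted here for the word processing formats and the DB2, Sybase, or Oracle systems named in the text, and the table name, column names, and example record are hypothetical.

```python
# Minimal sketch: record a sequence as ASCII text and in a small database.
# SQLite stands in for the database applications named in the text; the
# schema and the toy record are assumptions made for this illustration.
import sqlite3


def to_fasta(name, residues, width=60):
    """Format one sequence as FASTA-style ASCII text with wrapped lines."""
    lines = [residues[i:i + width] for i in range(0, len(residues), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


def store_sequence(conn, name, residues):
    """Insert or overwrite one named sequence in the `sequences` table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences "
        "(name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.execute(
        "INSERT OR REPLACE INTO sequences (name, residues) VALUES (?, ?)",
        (name, residues),
    )
    conn.commit()


def fetch_sequence(conn, name):
    """Return the stored residues for `name`, or None if absent."""
    row = conn.execute(
        "SELECT residues FROM sequences WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    store_sequence(conn, "toy_record", "ACGTACGT")
    print(to_fasta("toy_record", fetch_sequence(conn, "toy_record"), width=4))
```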

By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif.
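
A very simple form of such a search, exact matching of a target sequence against stored sequences, can be sketched as follows. The function names and the example records are hypothetical, and exact string matching is used only for illustration; it is not the similarity search performed by dedicated sequence comparison software.

```python
# Minimal sketch: exact-match search of a target against stored sequences.
# Names and toy records are assumptions; real search tools score similarity
# rather than requiring identity.


def find_target(sequence, target):
    """Return 0-based start positions of every (overlapping) exact match."""
    sequence, target = sequence.upper(), target.upper()
    hits = []
    pos = sequence.find(target)
    while pos != -1:
        hits.append(pos)
        pos = sequence.find(target, pos + 1)  # step by one to keep overlaps
    return hits


def search_records(records, target):
    """Map each record name to its match positions, omitting records with none."""
    results = {}
    for name, sequence in records.items():
        hits = find_target(sequence, target)
        if hits:
            results[name] = hits
    return results


if __name__ == "__main__":
    toy_records = {"rec_a": "ACGTACGT", "rec_b": "TTTTTTTT"}  # made-up data
    print(search_records(toy_records, "CGT"))
```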

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
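
The relationship between target length and chance occurrence can be made concrete with a back-of-the-envelope estimate. The sketch below assumes, purely for illustration, that residues in the database are independent and equally frequent, which real sequences are not; under that assumption a target of length n matches at any one position with probability (alphabet size)^-n. The function name and the database sizes are hypothetical.

```python
# Minimal sketch: expected number of chance matches of a target in a database,
# under the simplifying assumption of independent, equally frequent residues.


def expected_chance_matches(target_len, database_len, alphabet_size=4):
    """Expected count of random exact matches of one fixed target.

    positions = number of start positions where the target could fit.
    Each position matches with probability alphabet_size ** -target_len.
    """
    positions = max(0, database_len - target_len + 1)
    return positions * alphabet_size ** (-target_len)


if __name__ == "__main__":
    database_len = 10 ** 9  # hypothetical database size in nucleotides
    for n in (6, 12, 30):
        print(n, expected_chance_matches(n, database_len))
```

With a four-letter nucleotide alphabet, each additional residue in the target reduces the expected number of chance matches about fourfold, which is why a six-nucleotide target is expected to recur many times by chance in a large database while a thirty-nucleotide target is not.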

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

For example, software which implements the BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al. (1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system can be used to identify open reading frames (ORFs) of the sequences of the invention which contain homology to ORFs or proteins from other libraries. Such ORFs are protein encoding fragments and are useful in producing commercially important proteins such as enzymes used in various reactions and in the production of commercially useful metabolites.
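
The notion of an open reading frame can be illustrated with a short sketch that scans the three forward reading frames of a DNA sequence for a start codon followed in frame by a stop codon. This is not the BLAST or BLAZE algorithm and performs no homology comparison; it only locates candidate ORFs. The function name, the minimum-length parameter, and the example sequence are hypothetical, and the reverse strand is deliberately not scanned.

```python
# Minimal sketch: locate candidate ORFs (ATG ... stop) in the three forward
# frames. No homology search is performed; names and toy data are assumptions.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end) pairs, 0-based with end exclusive, stop included.

    min_codons counts codons from ATG up to, but not including, the stop.
    Only the first ATG after the previous stop opens an ORF in each frame.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # resume looking for the next ATG
    return orfs


if __name__ == "__main__":
    toy_seq = "CCATGAAATTTTAGGG"  # made-up sequence
    for start, end in find_orfs(toy_seq):
        print(start, end, toy_seq[start:end])
```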

Vectors/Host Cells

The invention also provides vectors containing the ADH polynucleotides. The term “vector” refers to a vehicle, preferably a nucleic acid molecule that can transport the ADH polynucleotides. When the vector is a nucleic acid molecule, the ADH polynucleotides are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the ADH polynucleotides. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the ADH polynucleotides when the host cell replicates.

The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the ADH polynucleotides. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the ADH polynucleotides such that transcription of the polynucleotides is allowed in a host cell. The polynucleotides can be introduced into the host cell with a separate polynucleotide capable of affecting transcription. Thus, the second polynucleotide may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the ADH polynucleotides from the vector. Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself.

It is understood, however, that in some embodiments, transcription and/or translation of the ADH polynucleotides can occur in a cell-free system.

The regulatory sequences to which the polynucleotides described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

A variety of expression vectors can be used to express an ADH polynucleotide. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g., cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The regulatory sequence may provide constitutive expression in one or more host cells (i.e., tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

The ADH polynucleotides can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.
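
The cleavage step can be illustrated in silico. The sketch below cuts one strand of a linear DNA sequence at each occurrence of a recognition site, using the EcoRI site GAATTC (cut between G and A) as the default. The function name and the example sequence are hypothetical, and the sketch models only the top strand; the staggered cut on the complementary strand and the subsequent ligation are not represented.

```python
# Minimal sketch: top-strand digestion of a linear DNA at a recognition site.
# Default site is EcoRI (G^AATTC). Names and toy data are assumptions; the
# complementary strand and ligation are not modeled.


def digest(seq, site="GAATTC", cut_offset=1):
    """Split `seq` at every occurrence of `site`.

    cut_offset is the number of site bases left on the upstream fragment,
    e.g. 1 for EcoRI, which cuts after the leading G.
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    toy_insert = "AAGAATTCTTGGGAATTCAA"  # made-up sequence with two sites
    print(digest(toy_insert))
```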

The vector containing the appropriate polynucleotide can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

As described herein, it may be desirable to express the polypeptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the ADH polypeptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting for example as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired polypeptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al. (1990) Gene Expression Technology: Methods in Enzymology 185:60-89).

Recombinant protein expression can be maximized in a host bacterium by providing a genetic background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S. (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Alternatively, the sequence of the polynucleotide of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118).
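
The idea of altering a sequence toward a host's preferred codons can be sketched as a lookup from each amino acid to a single chosen codon. The function name and the small codon table below are placeholders for illustration only; the table lists valid codons for the residues shown but is not a measured codon usage table for E. coli or any other host, and a real table would be supplied by the user.

```python
# Minimal sketch: rewrite a coding sequence using one chosen codon per residue.
# The table is an illustrative placeholder, NOT measured host codon usage.

EXAMPLE_PREFERRED_CODONS = {
    "M": "ATG",  # methionine (only codon)
    "K": "AAA",  # lysine
    "L": "CTG",  # leucine
    "*": "TAA",  # stop
}


def back_translate(protein, preferred_codons):
    """Return a DNA sequence encoding `protein` with the supplied codon choices.

    protein: one-letter amino acid string; '*' marks a stop.
    Raises KeyError if a residue has no entry in `preferred_codons`.
    """
    protein = protein.upper()
    missing = sorted(set(protein) - set(preferred_codons))
    if missing:
        raise KeyError("no codon supplied for: " + ", ".join(missing))
    return "".join(preferred_codons[residue] for residue in protein)


if __name__ == "__main__":
    print(back_translate("MKL*", EXAMPLE_PREFERRED_CODONS))
```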

The ADH polynucleotides can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include pYepSec1 (Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan et al. (1982) Cell 30:933-943), pJRY88 (Schultz et al. (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

The ADH polynucleotides can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell. Biol. 3:2156-2165) and the pVL series (Luckow et al. (1989) Virology 170:31-39).

In certain embodiments of the invention, the polynucleotides described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195).

The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the ADH polynucleotides. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation, or expression of the polynucleotides described herein. These are found, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the polynucleotide sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).

The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, lipofection, and other techniques such as those found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

Host cells can contain more than one vector. Thus, different nucleotide sequences can be introduced on different vectors of the same cell. Similarly, the ADH polynucleotides can be introduced either alone or with other polynucleotides that are not related to the ADH polynucleotides such as those providing trans-acting factors for expression vectors. When more than one vector is introduced into a cell, the vectors can be introduced independently, co-introduced or joined to the ADH polynucleotide vector.

In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be replication-competent or replication-defective. In the case in which viral replication is defective, replication will occur in host cells providing functions that complement the defects.

Vectors generally include selectable markers that enable the selection of the subpopulation of cells that contain the recombinant vector constructs. The marker can be contained in the same vector that contains the polynucleotides described herein or may be on a separate vector. Markers include tetracycline or ampicillin-resistance genes for prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that provides selection for a phenotypic trait will be effective.

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other cells under the control of the appropriate regulatory sequences, cell-free transcription and translation systems can also be used to produce these proteins using RNA derived from the DNA constructs described herein.

Where secretion of the polypeptide is desired, appropriate secretion signals are incorporated into the vector. The signal sequence can be endogenous to the ADH polypeptides or heterologous to these polypeptides.

Where the polypeptide is not secreted into the medium, the protein can be isolated from the host cell by standard disruption procedures, including freeze thaw, sonication, mechanical disruption, use of lysing agents and the like. The polypeptide can then be recovered and purified by well-known purification methods including ammonium sulfate precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, or high performance liquid chromatography.

It is also understood that depending upon the host cell in recombinant production of the polypeptides described herein, the polypeptides can have various glycosylation patterns, depending upon the cell, or may be non-glycosylated as when produced in bacteria. In addition, the polypeptides may include an initial modified methionine in some cases as a result of a host-mediated process.

Uses of Vectors and Host Cells

It is understood that “host cells” and “recombinant host cells” refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

The host cells expressing the polypeptides described herein, and particularly recombinant host cells, have a variety of uses. First, the cells are useful for producing ADH proteins or polypeptides that can be further purified to produce desired amounts of ADH protein or fragments. Thus, host cells containing expression vectors are useful for polypeptide production.

Host cells are also useful for conducting cell-based assays involving the ADH or ADH fragments. Thus, a recombinant host cell expressing a native ADH is useful to assay for compounds that stimulate or inhibit ADH function. This includes gene expression at the level of transcription or translation, interactions with coenzymes, substrates or ADH subunits, and catalysis of substrate oxidation/reduction.

Host cells are also useful for identifying ADH mutants in which these functions are affected. If the mutants naturally occur and give rise to a pathology, host cells containing the mutations are useful to assay compounds that have a desired effect on the mutant ADH (for example, stimulating or inhibiting function) which may not be indicated by their effect on the native ADH.

Recombinant host cells are also useful for expressing the chimeric polypeptides described herein to assess compounds that activate or suppress activation by means of a heterologous domain, segment, site, and the like, as disclosed herein.

Further, mutant ADHs can be designed in which one or more of the various functions is engineered to be increased or decreased (e.g., coenzyme, substrate, or ADH subunits) and used to augment or replace ADH proteins in an individual. Thus, host cells can provide a therapeutic benefit by replacing an aberrant ADH or providing an aberrant ADH that provides a therapeutic result. In one embodiment, the cells provide ADHs that are abnormally active.

In another embodiment, the cells provide ADHs that are abnormally inactive. These ADHs can compete with endogenous ADHs in the individual.

In another embodiment, cells expressing ADHs that are not catalytically active are introduced into an individual in order to compete with endogenous ADHs for substrate, coenzymes or ADH subunits. For example, in the case in which an excessive amount of an ADH substrate is part of a treatment modality, it may be necessary to inactivate this molecule at a specific point in treatment. Providing cells that compete for the molecule, but which cannot be affected by ADH activation would be beneficial.

Homologously recombinant host cells can also be produced that allow the in situ alteration of endogenous ADH polynucleotide sequences in a host cell genome. The host cell includes, but is not limited to, a stable cell line, cell in vivo, or cloned microorganism. This technology is more fully described in WO 93/09222, WO 91/12650, WO 91/06667, U.S. Pat. No. 5,272,071, and U.S. Pat. No. 5,641,670. Briefly, specific polynucleotide sequences corresponding to the ADH polynucleotides or sequences proximal or distal to an ADH gene are allowed to integrate into a host cell genome by homologous recombination where expression of the gene can be affected. In one embodiment, regulatory sequences are introduced that either increase or decrease expression of an endogenous sequence. Accordingly, an ADH protein can be produced in a cell not normally producing it. Alternatively, increased expression of ADH protein can be effected in a cell normally producing the protein at a specific level. Further, expression can be decreased or eliminated by introducing a specific regulatory sequence. The regulatory sequence can be heterologous to the ADH protein sequence or can be a homologous sequence with a desired mutation that affects expression. Alternatively, the entire gene can be deleted. The regulatory sequence can be specific to the host cell or capable of functioning in more than one cell type. Still further, specific mutations can be introduced into any desired region of the gene to produce mutant ADH proteins. Such mutations could be introduced, for example, into the specific functional regions such as the substrate-binding site.

In one embodiment, the host cell can be a fertilized oocyte or embryonic stem cell that can be used to produce a transgenic animal containing the altered ADH gene. Alternatively, the host cell can be a stem cell or other early tissue precursor that gives rise to a specific subset of cells and can be used to produce transgenic tissues in an animal. See also Thomas et al., Cell 51:503 (1987) for a description of homologous recombination vectors. The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced gene has homologously recombined with the endogenous ADH gene are selected (see e.g., Li, E. et al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT International Publication Nos. WO 90/11354; WO 91/01140; and WO 93/04169.

The genetically engineered host cells can be used to produce non-human transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse, in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal in one or more cell types or tissues of the transgenic animal. These animals are useful for studying the function of an ADH protein and identifying and evaluating modulators of ADH protein activity.

Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians.

In one embodiment, a host cell is a fertilized oocyte or an embryonic stem cell into which ADH polynucleotide sequences have been introduced.

A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. Any of the ADH nucleotide sequences can be introduced as a transgene into the genome of a non-human animal, such as a mouse.

Any of the regulatory or other sequences useful in expression vectors can form part of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already included. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct expression of the ADH protein to particular cells.

Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to other transgenic animals carrying other transgenes. A transgenic animal also includes animals in which the entire animal or tissues in the animal have been produced using the homologously recombinant host cells described herein.

In another embodiment, transgenic non-human animals can be produced which contain selected systems that allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS 89:6232-6236. Another example of a recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature 385:810-813 and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle and enter G₀ phase. The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst and then transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

Transgenic animals containing recombinant cells that express the polypeptides described herein are useful to conduct the assays described herein in an in vivo context. The various physiological factors that are present in vivo and that could affect substrate and coenzyme binding, and oxidation of the substrate, may not be evident from in vitro cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to assay in vivo ADH function, including substrate and coenzyme interactions and substrate oxidation. Similar methods could be used to determine the effect of specific mutant ADHs and the effect of chimeric ADHs on such enzyme functions. It is also possible to assess the effect of null mutations, that is, mutations that substantially or completely eliminate one or more ADH functions.

In general, methods for producing transgenic animals include introducing a nucleic acid sequence according to the present invention, the nucleic acid sequence capable of expressing the ADH protein in a transgenic animal, into a cell in culture or in vivo. When introduced in vivo, the nucleic acid is introduced into an intact organism such that one or more cell types and, accordingly, one or more tissue types, express the nucleic acid encoding the ADH protein. Alternatively, the nucleic acid can be introduced into virtually all cells in an organism by transfecting a cell in culture, such as an embryonic stem cell, as described herein for the production of transgenic animals, and this cell can be used to produce an entire transgenic organism. As described, in a further embodiment, the host cell can be a fertilized oocyte. Such cells are then allowed to develop in a female foster animal to produce the transgenic organism.

Pharmaceutical Compositions

The ADH nucleic acid molecules, protein (such as an extracellular loop), modulators of the protein, and antibodies (also referred to herein as “active compounds”) can be incorporated into pharmaceutical compositions suitable for administration to a subject, e.g., a human. Such compositions typically comprise the nucleic acid molecule, protein, modulator, or antibody and a pharmaceutically acceptable carrier.

The term “administer” is used in its broadest sense and includes any method of introducing the compositions of the present invention into a subject. This includes producing polypeptides or polynucleotides in vivo as by transcription or translation, in vivo, of polynucleotides that have been exogenously introduced into a subject. Thus, polypeptides or nucleic acids produced in the subject from the exogenous compositions are encompassed in the term “administer.”

As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, such media can be used in the compositions of the invention. Supplementary active compounds can also be incorporated into the compositions.

A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampules, disposable syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents in the composition, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., an ADH protein or anti-ADH antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For oral administration, the agent can be contained in enteric forms to survive the stomach or further coated or mixed to be released in a particular region of the GI tract by known methods. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser, which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. “Dosage unit form” as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated, each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specifications for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) PNAS 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g. retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight.
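The per-kilogram ranges above translate into total amounts by simple multiplication with body weight. The following is a minimal arithmetic sketch only; the dictionary labels, the function name `total_dose_range_mg`, and the 70 kg example weight are illustrative assumptions and do not appear in the specification.

```python
# Illustrative arithmetic only: the names and the 70 kg example weight are
# assumptions made for this sketch, not terms taken from the specification.

# Ranges recited above, in mg of protein or polypeptide per kg body weight.
PROTEIN_DOSE_RANGES_MG_PER_KG = {
    "broad": (0.001, 30.0),
    "preferred": (0.01, 25.0),
    "more preferred": (0.1, 20.0),
    "even more preferred": (1.0, 10.0),
}


def total_dose_range_mg(body_weight_kg, range_mg_per_kg):
    """Scale a (low, high) mg/kg range to total mg for one body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    weight_kg = 70  # hypothetical subject weight
    for label, per_kg in PROTEIN_DOSE_RANGES_MG_PER_KG.items():
        low, high = total_dose_range_mg(weight_kg, per_kg)
        print(f"{label}: about {low:.3f} to {high:.3f} mg total")
```

Because the recited ranges are approximate ("about"), the computed totals should be read as approximate bounds as well.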

The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments. In a preferred example, a subject is treated with antibody, protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays as described herein.

The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

It is understood that appropriate doses of small molecule agents depend upon a number of factors within the ken of the ordinarily skilled physician, veterinarian, or researcher. The dose(s) of the small molecule will vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be determined using the assays described herein. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.
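Because the exemplary small molecule ranges mix microgram and milligram units, it can help to restate them in a single unit before comparing them. The sketch below does this in micrograms using integer arithmetic; the constant and function names and the 70 kg example weight are hypothetical, chosen only for illustration.

```python
# Illustrative unit conversion only: names and the 70 kg example weight are
# assumptions made for this sketch, not terms taken from the specification.

UG_PER_MG = 1000  # micrograms per milligram

# The three exemplary ranges recited above, restated in micrograms per kg.
SMALL_MOLECULE_RANGES_UG_PER_KG = [
    (1, 500 * UG_PER_MG),  # about 1 microgram/kg to about 500 milligrams/kg
    (100, 5 * UG_PER_MG),  # about 100 micrograms/kg to about 5 milligrams/kg
    (1, 50),               # about 1 microgram/kg to about 50 micrograms/kg
]


def total_dose_ug(body_weight_kg, dose_ug_per_kg):
    """Total micrograms for one subject or sample weight."""
    if body_weight_kg <= 0 or dose_ug_per_kg < 0:
        raise ValueError("weight must be positive and dose non-negative")
    return body_weight_kg * dose_ug_per_kg


def ug_to_mg(amount_ug):
    """Convert micrograms to milligrams."""
    return amount_ug / UG_PER_MG


if __name__ == "__main__":
    weight_kg = 70  # hypothetical subject weight
    for low, high in SMALL_MOLECULE_RANGES_UG_PER_KG:
        low_total = total_dose_ug(weight_kg, low)
        high_total = total_dose_ug(weight_kg, high)
        print(f"{ug_to_mg(low_total)} mg to {ug_to_mg(high_total)} mg total")
```

Restated this way, the third range is a sub-range of the first, and the second sits between them on the upper end, which is consistent with the text presenting them as alternative exemplary ranges.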

This invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will fully convey the invention to those skilled in the art. Many modifications and other embodiments of the invention will come to the mind of one skilled in the art to which this invention pertains having the benefit of the teachings presented in the foregoing description. Although specific terms are employed, they are used as in the art unless otherwise indicated.

CHAPTER 3

23484, a Novel Human Ubiquitin Protease

BACKGROUND OF THE INVENTION

The Ubiquitin System

Several biological processes are controlled by the ubiquitination of cellular protein. Cellular processes that are affected by ubiquitin modification include the regulation of gene expression, regulation of the cell cycle and cell division, cellular housekeeping, cell-specific metabolic pathways, disposal of mutated or post-translationally damaged proteins, the cellular stress response, modification of cell surface receptors, DNA repair, import of proteins into mitochondria, uptake of precursors into neurons, biogenesis of mitochondria, ribosomes, and peroxisomes, apoptosis, and growth factor-mediated signal transduction.

For some protein substrates ubiquitination leads to protein degradation by the 26S proteasomal complex. A wide variety of protein substrates are degraded by the 26S proteasomal complex following ubiquitination of the substrate. Degradation of a protein by the ubiquitin system involves two steps. The first involves the covalent attachment of multiple ubiquitin molecules to the substrate protein. The second involves degradation of the ubiquitinated protein by the 26S proteasome. In some cases, degradation of the ubiquitinated protein can occur by means of the lysosomal pathway.

The 26S proteasome comprises a 20S core catalytic complex which is flanked by two 19S regulatory complexes. The 26S complex recognizes ubiquitinated proteins. Substrate recognition by the 26S proteasome, however, may be mediated by the interaction of specific subunits of the 19S complex with the ubiquitin chain. The ubiquitinated protein is degraded by specific and energy-dependent proteases into free amino acids and free and reutilizable ubiquitin.

The 19S regulatory complex consists of many subunits that can be classified into ATPases and non-ATPases. This complex is thought to act in recognition, unfolding, and translocation of the substrates into the 20S proteasome for proteolysis. The regulatory complex contains isopeptidases capable of deubiquitinating substrates (Spataro et al. (1998) British Journal of Cancer 77:448-455).

The ubiquitin proteasome pathway functions to degrade abnormal proteins, short-lived normal proteins, long-lived normal proteins, and proteins of the endoplasmic reticulum. Important regulatory proteins rapidly inactivated by proteolysis include c-JUN, c-FOS, and p53 (Lecker et al. (1999) Journal of Nutrition 129:227S-237S). Conditions that stimulate protein degradation by the ubiquitin proteasome pathway include eating disorders, renal tubular defects, diabetes, uremia, neuromuscular disease, immobilization, burn injuries, sepsis, cancer, cachexia, hyperadrenocortisolism and hyperthyroidism.

Cellular proteins degraded by the ubiquitin system include cell cycle regulators, including mitotic cyclins, G1 cyclins, CDK inhibitors, anaphase inhibitors, transcription factors, tumor suppressors, and oncoproteins such as NF-κB and IκBα, p53, JUN, β-catenin, E2F-1, and membrane proteins such as Ste2p, GH receptor, T-cell receptor, platelet-derived growth factor, lymphocyte homing receptor, MET tyrosine kinase receptor, hepatocyte growth factor-scatter factor, connexin 43, the high affinity IgE receptor, the prolactin receptor, and the EGF receptor (Hershko et al. (1998) Annual Review of Biochemistry 67:425-479).

Ubiquitination does not only result in proteolytic degradation. For some protein substrates, ubiquitination is a reversible post-translational modification that can regulate cellular targeting and enzymatic activity. This includes targeting to the vacuole, activation of enzyme activity, such as Ikβ kinase activation, and activation of cytokine receptor-mediated signal transduction (D'Andrea et al. (1998) Critical Reviews In Biochemistry and Molecular Biology 33:337-352). The T-cell receptor undergoes ubiquitination in response to receptor engagement. Platelet derived growth factor undergoes multiple ubiquitination following ligand binding. Soluble steel factor has been shown to stimulate rapid polyubiquitination of the c-KIT receptor.

It has been shown that protein degradation accounts for regulation of proteins such as cyclins, cyclin-dependent kinase inhibitors, p53, c-JUN and c-FOS (Spataro et al. above). The ubiquitin system has also been shown to be involved in antigen presentation. The 26S proteasome is responsible for processing MHC-restricted class I antigens (Spataro et al. above).

The ubiquitin system has been implicated in various diseases. One group includes pathology that results from loss of function, a mutation in an enzyme or substrate that leads to stabilization of the protein and consequent build up of a protein to abnormally high levels. The second involves pathologies that result from a gain of function that produces increased protein degradation.

The ubiquitin system has been implicated in various malignancies. In cervical carcinoma, low levels of p53 have been found. This protein is targeted for degradation by HPV E6-associated protein. Removal of the suppressor by this oncoprotein may be a mechanism utilized by the virus to transform cells. Other results have shown that c-JUN, but not the transforming counterpart, v-JUN, is ubiquitinated and subsequently degraded. Other studies show that low levels of p27, a cell division kinase inhibitor whose degradation is necessary for proper cell cycle progression, are correlated with colorectal and breast carcinomas. The low level of this inhibitor is due to activation of the ubiquitin system.

Human genetic diseases involving aberrant proteolysis have been reviewed (Kato (1999) Human Mutation 13:87-98). Cystic fibrosis has been correlated with the ubiquitin system. The cystic fibrosis transmembrane regulator in cystic fibrosis patients is almost completely degraded by the ubiquitin system so that an abnormally low amount of the wild type protein is found on the cell surface. In Angelman's syndrome, one of the enzymes involved in ubiquitination (E3) is affected. In Liddle syndrome, the E3 enzyme is also affected.

The ubiquitin system can also affect the immune and inflammatory response. The persistence of EBNA-1 contributes to some virus related pathologies. A sequence on this protein was found to inhibit degradation by the ubiquitin system. This inhibited processing and subsequent presentation of viral epitopes by MHC protein.

The ubiquitin system has also been implicated in neurodegenerative diseases. Ubiquitin immunohistochemistry has shown enrichment of ubiquitin conjugates in senile plaques, lysosomes, endosomes, and a variety of inclusion bodies and degenerative fibers in many neurodegenerative diseases, such as Alzheimer's, Parkinson's and Lewy body diseases, amyotrophic lateral sclerosis, and Creutzfeldt-Jakob disease. Further, in Huntington disease and spinocerebellar ataxias, the proteins encoded by the affected genes aggregate in ubiquitin- and proteasome-positive intranuclear inclusion bodies.

The ubiquitin system has been associated with muscle wasting (Mitch et al. (1999) American Journal of Physiology 276:C1132-C1138 and Lecker et al. above) and muscle-wasting diseases, and with such pathological states as fasting, starvation, sepsis, and denervation, all of which result from accelerated ubiquitin-mediated proteolysis (see Ciechanover, EMBO Journal 17:7151-7160 (1998)).

The ubiquitin system is also involved in development. The involvement in human brain development is indicated by the fact that a mutation in an E3 enzyme is implicated as the cause of Angelman's syndrome, a disorder characterized by mental retardation, seizures, and abnormal gait (Hershko et al. above).

The ubiquitin system is also associated with apoptosis. Ubiquitin-proteasome-mediated proteolysis is reported to play an important role in apoptosis of nerve growth factor-deprived neurons (Sadoul et al. (1996) EMBO Journal 15:3845-3852). One of the first genes shown to be involved in programmed cell death is the polyubiquitin gene that is regulated during metamorphosis of Manduca sexta. Radiation-induced apoptosis in human lymphocytes has been shown to be accompanied by increased ubiquitin mRNA and ubiquitinylated nuclear proteins. Further, drugs that interfere with proteasome function, such as lactacystin, prevent radiation-induced cell death of thymocytes (Hershko et al. above).

Deubiquitinating Enzymes

Deubiquitinating enzymes are cysteine proteases that specifically cleave ubiquitin conjugates at the ubiquitin carboxy terminus. These enzymes are responsible for processing linear polyubiquitin chains to generate free ubiquitin from precursor fusion proteins. They also affect pools of free ubiquitin by recycling branched chain ubiquitin. These enzymes also remove ubiquitin from ubiquitin- and polyubiquitin-conjugated target protein, thereby regulating localization or activity of the target. Further, these enzymes can remove ubiquitin from a ubiquitinated tagged protein and thereby rescue the protein from degradation by the 26S proteasome. The end result of each of these activities is to affect the level of free intracellular ubiquitin (D'Andrea et al., above) and the level of specific proteins.

Ubiquitin is synthesized in a variety of functionally-distinct forms. One of these is a linear head-to-tail polyubiquitin precursor. Release of the free molecules involves specific enzymatic cleavage between the fused residues. The last ubiquitin moiety in many of these precursors is encoded with an extra C-terminal residue that must be removed to expose the active C-terminal Gly. In general, the recycling enzymes are thiol proteases that recognize the C-terminal domain/residue of ubiquitin. These are divided into two classes. The first is designated ubiquitin C-terminal hydrolase (UCH) and the second is designated ubiquitin-specific protease (UBP; isopeptidases) (Ciechanover, above). These enzymes have been reviewed in detail in D'Andrea, above.

UBPs contain six conserved regions. One surrounds the conserved cysteine, one surrounds the aspartic acid, one surrounds the histidine, and three additional regions of unknown function have been identified. These six domains provide a molecular signature for the UBP family. Short sequences surrounding the cysteine residue and histidine residue are highly conserved among all UBPs. Sequence comparison of several UBP family members reveals that there are various subfamilies. One subfamily, designated DUB, contains enzymes that are transcriptionally induced in response to cytokines. The UBP family contains enzymes whose members have multiple ubiquitin binding sites. Identified members of this family include DUB1, isoT, UBP3, Doa4, Tre2, and FAF (D'Andrea et al. above).

The UCH family is distinct from the UBP family. These enzymes are cysteine proteases but do not contain the six homology domains characteristic of the UBP family. Further, there is only one binding site for ubiquitin. With respect to substrate specificity, the UCH family preferentially cleaves ubiquitin from small molecules, such as peptides and amino acids. Further, the two families share little sequence homology with each other, although the UCH signature can be found in some UBPs.

The deubiquitinating enzymes can promote either degradation or stabilization of a given substrate. One of the best characterized deubiquitinating enzymes is the yeast UBP14p enzyme which has a human homolog designated isopeptidase-T. Isopeptidase-T hydrolyzes free polyubiquitin chains and stimulates degradation of polyubiquitinated protein substrates by the 26S proteasome. In vitro data suggest that the cellular role of isopeptidase-T is to disassemble unanchored polyubiquitin chains. The isopeptidase-T then sequentially degrades these polyubiquitin chains into ubiquitin monomers.

The yeast Doa4 promotes ubiquitin-mediated proteolysis of cellular substrates. The primary function appears to be the hydrolysis of isopeptide-linked ubiquitin chains from peptides that are the by-products of proteasome degradation. The function appears to be the clipping of polymeric ubiquitin from peptide degradation products. In summary, with respect to a degradation function, isopeptidases can produce free ubiquitin monomers from straight chain polyubiquitin, branched chain polyubiquitin, ubiquitin or polyubiquitin attached to substrate proteins, and ubiquitin or polyubiquitin attached to substrate remnants, such as peptides or amino acids.

Deubiquitinating enzymes that promote stabilization of substrates include the FAF protein. Results show that the FAF protein deubiquitinates and rescues a ubiquitin-conjugated target, preventing its degradation by the proteasome. Another deubiquitinating enzyme, designated PA700 isopeptidase, also prevents proteasome degradation. This enzyme has been isolated from the 19S regulatory complex. This enzyme appears to remove one ubiquitin at a time starting from the distal end of a polyubiquitin chain.

The enzymes have been associated with growth control. The mammalian oncoprotein Tre-2 is a member of the UBP superfamily. The transforming isoform of the Tre-2 oncoprotein is a truncated UBP lacking the histidine domain and lacking deubiquitinating activity. The full length Tre-2 protein has deubiquitinating activity but no transforming activity. Accordingly, it has been suggested that this protein acts as a growth suppressor within the cell.

Another UBP that regulates cellular function is designated DUB. DUB-1 was originally shown to be induced by interleukin-3 stimulation. It has been postulated that the DUB protein family is generally responsive to cytokines. It has also been shown that another family member, DUB-2, is induced by interleukin-2 (Zhu et al. (1997) Journal of Biological Chemistry 272:51-57).

The enzymes may deubiquitinate cell surface growth factor receptors thereby prolonging receptor half life and amplifying growth signals. They may also deubiquitinate proteins involved in signal transduction and deubiquitinate cell cycle regulators such as cyclins or cyclin-CDK inhibitors. See D'Andrea above.

UBPs have also been linked to the chromatin regulatory process,transcriptional silencing. UBP-3 has been reported to complex withSIR-4, a trans-acting factor that is required for establishment andmaintenance of silencing. Accordingly, UBP-3 may act as an inhibitor ofsilencing by either stabilizing an inhibitor or by removing a positiveregulator.

The murine UNP protooncogene has been shown to encode a nuclearubiquitin protease whose overexpression leads to oncogenictransformation in NIH3T3 cells. A cDNA was cloned corresponding to thehuman homolog of this gene. It was shown to map to a region frequentlyrearranged in human tumor cells. Further, it was shown that levels ofthis gene are elevated in small cell tumors and adenocarcinomas of thelung, suggesting a causative role of the gene in the neoplastic process(Gray et al. (1995) Oncogene 10:2179-2183).

A novel ubiquitin-specific protease, designated UBP-43, was cloned from a leukemia fusion protein in AML1-ETO knock-in mice. This protease was shown to function in hematopoietic cell differentiation. The overexpression of this gene was shown to block cytokine-induced terminal differentiation of monocytic cells (Liu et al. (1999) Molecular and Cellular Biology 19:3029-3038).

In summary, deubiquitinating enzymes are potentially powerful targetsfor modulating ubiquitination. Modulation of ubiquitination can increaseor decrease the proteolysis of specific proteins, particularly keyproteins in cellular processes, can increase or decrease levels ofgeneral proteolysis, thus affecting the basic metabolic state, and mayincrease or decrease the pool of free ubiquitin monomers available forubiquitination.

Accordingly, ubiquitin proteases are a major target for drug action anddevelopment. Thus, it is valuable to the field of pharmaceuticaldevelopment to identify and characterize previously unknown ubiquitinproteases. The present invention advances the state of the art byproviding a previously unidentified human deubiquitinating enzyme.

SUMMARY OF THE INVENTION

It is an object of the invention to identify novel ubiquitin proteases.

It is a further object of the invention to provide novel ubiquitinprotease polypeptides that are useful as reagents or targets in assaysapplicable to treatment and diagnosis of ubiquitin-mediated or -relateddisorders, especially disorders mediated by or related todeubiquitinating enzymes.

It is a further object of the invention to provide polynucleotidescorresponding to the novel ubiquitin protease polypeptides that areuseful as targets and reagents in assays applicable to treatment anddiagnosis of ubiquitin or ubiquitin protease-mediated or -relateddisorders and useful for producing novel ubiquitin protease polypeptidesby recombinant methods.

A specific object of the invention is to identify compounds that act asagonists and antagonists and modulate the expression of the novelubiquitin protease.

A further specific object of the invention is to provide compounds thatmodulate expression of the ubiquitin protease for treatment anddiagnosis of ubiquitin and ubiquitin protease-related disorders.

The invention is thus based on the identification of a novel humanubiquitin protease. The amino acid sequence is shown in SEQ ID NO:15.The nucleotide sequence is shown in SEQ ID NO:16.

The invention provides isolated ubiquitin protease polypeptides,including a polypeptide having the amino acid sequence shown in SEQ IDNO:15 or the amino acid sequence encoded by the cDNA deposited as ATCCNo. PTA-1849 on May 9, 2000 (“the deposited cDNA”).

The invention also provides isolated ubiquitin protease nucleic acidmolecules having the sequence shown in SEQ ID NO:16 or in the depositedcDNA.

The invention also provides variant polypeptides having an amino acidsequence that is substantially homologous to the amino acid sequenceshown in SEQ ID NO:15 or encoded by the deposited cDNA.

The invention also provides variant nucleic acid sequences that aresubstantially homologous to the nucleotide sequence shown in SEQ IDNO:16 or in the deposited cDNA.

The invention also provides fragments of the polypeptide shown in SEQ IDNO:15 and nucleotide sequence shown in SEQ ID NO:16, as well assubstantially homologous fragments of the polypeptide or nucleic acid.

The invention further provides nucleic acid constructs comprising thenucleic acid molecules described herein. In a preferred embodiment, thenucleic acid molecules of the invention are operatively linked to aregulatory sequence.

The invention also provides vectors and host cells for expressing theubiquitin protease nucleic acid molecules and polypeptides, andparticularly recombinant vectors and host cells.

The invention also provides methods of making the vectors and host cellsand methods for using them to produce the ubiquitin protease nucleicacid molecules and polypeptides.

The invention also provides antibodies or antigen-binding fragmentsthereof that selectively bind the ubiquitin protease polypeptides andfragments.

The invention also provides methods of screening for compounds thatmodulate expression or activity of the ubiquitin protease polypeptidesor nucleic acid (RNA or DNA).

The invention also provides a process for modulating ubiquitin proteasepolypeptide or nucleic acid expression or activity, especially using thescreened compounds. Modulation may be used to treat conditions relatedto aberrant activity or expression of the ubiquitin proteasepolypeptides or nucleic acids or of the ubiquitin system. In addition,modulation may be used to treat conditions, such as viral infection,that are affected by the ubiquitin protease.

The invention also provides assays for determining the activity of orthe presence or absence of the ubiquitin protease polypeptides ornucleic acid molecules in a biological sample, including for diseasediagnosis.

The invention also provides assays for determining the presence of amutation in the polypeptides or nucleic acid molecules, including fordisease diagnosis.

In still a further embodiment, the invention provides a computerreadable means containing the nucleotide and/or amino acid sequences ofthe nucleic acids and polypeptides of the invention, respectively.

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter withreference to the accompanying drawings, in which some, but not allembodiments of the invention are shown. Indeed, these inventions may beembodied in many different forms and should not be construed as limitedto the embodiments set forth herein; rather, these embodiments areprovided so that this disclosure will satisfy applicable legalrequirements. Like numbers refer to like elements throughout.

Many modifications and other embodiments of the inventions set forthherein will come to mind to one skilled in the art to which theseinventions pertain having the benefit of the teachings presented in theforegoing descriptions and the associated drawings. Therefore, it is tobe understood that the inventions are not to be limited to the specificembodiments disclosed and that modifications and other embodiments areintended to be included within the scope of the appended claims.Although specific terms are employed herein, they are used in a genericand descriptive sense only and not for purposes of limitation.

Polypeptides

The invention is based on the identification of a novel human ubiquitinprotease. Specifically, an expressed sequence tag (EST) was selectedbased on homology to ubiquitin protease sequences. This EST was used todesign primers based on sequences that it contains and used to identifya cDNA from a human prostate library. Positive clones were sequenced andthe overlapping fragments were assembled. Analysis of the assembledsequence revealed that the cloned cDNA molecule encodes a ubiquitinprotease containing the conserved amino acid residues found in UBP andUCH thiol proteases.

The invention thus relates to a novel ubiquitin protease having the deduced amino acid sequence shown in FIGS. 33A-33D (SEQ ID NO:15) or having the amino acid sequence encoded by the cDNA deposited as ATCC No. PTA-1849 on May 9, 2000.

The deposits will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms. The deposits are provided as a convenience to those of skill in the art and are not an admission that a deposit is required under 35 U.S.C. § 112. The deposited sequences, as well as the polypeptides encoded by the sequences, are incorporated herein by reference and control in the event of any conflict, such as a sequencing error, with the description in this application.

“Ubiquitin protease polypeptide” or “ubiquitin protease protein” refersto the polypeptide in SEQ ID NO:15 or encoded by the deposited cDNA. Theterm “ubiquitin protease protein” or “ubiquitin protease polypeptide”,however, further includes the numerous variants described herein, aswell as fragments derived from the full-length ubiquitin proteases andvariants.

Tissues and/or cells in which the ubiquitin protease is expressedinclude, but are not limited to those shown in FIGS. 37 and 38. Tissuesin which the gene is highly expressed include fetal kidney, testes,fetal liver, ovary, and fetal heart. Expression is also seen in thekidney, thyroid, undifferentiated osteoblasts and skeletal muscle. Theubiquitin protease is also expressed in normal liver and in normal andmalignant breast, lung, and colon tissue and in liver metastases derivedfrom malignant colonic tissues. Hence, the ubiquitin protease isrelevant to disorders involving the tissues in which it is expressed,especially in breast, lung, colon, and colon metastases to liver.Expression has been confirmed by Northern blot analysis.

The present invention thus provides an isolated or purified ubiquitinprotease polypeptide and variants and fragments thereof.

Based on a BLAST search, highest homology was shown to Ubiquitin Carboxyl-terminal hydrolase (AL031525) from S. pombe.

As used herein, a polypeptide is said to be “isolated” or “purified”when it is substantially free of cellular material when it is isolatedfrom recombinant and non-recombinant cells, or free of chemicalprecursors or other chemicals when it is chemically synthesized. Apolypeptide, however, can be joined to another polypeptide with which itis not normally associated in a cell and still be considered “isolated”or “purified.”

The ubiquitin protease polypeptides can be purified to homogeneity. Itis understood, however, that preparations in which the polypeptide isnot purified to homogeneity are useful and considered to contain anisolated form of the polypeptide. The critical feature is that thepreparation allows for the desired function of the polypeptide, even inthe presence of considerable amounts of other components. Thus, theinvention encompasses various degrees of purity.

In one embodiment, the language “substantially free of cellularmaterial” includes preparations of the ubiquitin protease having lessthan about 30% (by dry weight) other proteins (i.e., contaminatingprotein), less than about 20% other proteins, less than about 10% otherproteins, or less than about 5% other proteins. When the polypeptide isrecombinantly produced, it can also be substantially free of culturemedium, i.e., culture medium represents less than about 20%, less thanabout 10%, or less than about 5% of the volume of the proteinpreparation.

A ubiquitin protease polypeptide is also considered to be isolated when it is part of a membrane preparation or is purified and then reconstituted with membrane vesicles or liposomes.

The language “substantially free of chemical precursors or otherchemicals” includes preparations of the ubiquitin protease polypeptidein which it is separated from chemical precursors or other chemicalsthat are involved in its synthesis. In one embodiment, the language“substantially free of chemical precursors or other chemicals” includespreparations of the polypeptide having less than about 30% (by dryweight) chemical precursors or other chemicals, less than about 20%chemical precursors or other chemicals, less than about 10% chemicalprecursors or other chemicals, or less than about 5% chemical precursorsor other chemicals.

In one embodiment, the ubiquitin protease polypeptide comprises theamino acid sequence shown in SEQ ID NO:15. However, the invention alsoencompasses sequence variants. Variants include a substantiallyhomologous protein encoded by the same genetic locus in an organism,i.e., an allelic variant.

Variants also encompass proteins derived from other genetic loci in anorganism, but having substantial homology to the ubiquitin protease ofSEQ ID NO:15. Variants also include proteins substantially homologous tothe ubiquitin protease but derived from another organism, i.e., anortholog. Variants also include proteins that are substantiallyhomologous to the ubiquitin protease that are produced by chemicalsynthesis. Variants also include proteins that are substantiallyhomologous to the ubiquitin protease that are produced by recombinantmethods. It is understood, however, that variants exclude any amino acidsequences disclosed prior to the invention.

As used herein, two proteins (or a region of the proteins) aresubstantially homologous when the amino acid sequences are at leastabout 70-75%, typically at least about 80-85%, and most typically atleast about 90-95% or more homologous. A substantially homologous aminoacid sequence, according to the present invention, will be encoded by anucleic acid sequence hybridizing to the nucleic acid sequence, orportion thereof, of the sequence shown in SEQ ID NO:16 under stringentconditions as more fully described below.

To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, or 90% of the length of the reference sequence (i.e., 100%=the entire coding sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
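By way of illustration only, the position-by-position comparison described above can be sketched as follows. The function name `percent_identity`, the use of "-" as the gap character, and the choice to count gapped columns in the denominator are assumptions made for this sketch; other conventions for accounting for gaps are equally consistent with the description.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two sequences that have ALREADY been aligned.

    Assumption for this sketch: identity = identical columns divided by
    all alignment columns, so every gap position lowers the result.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Drop columns in which both sequences carry a gap (uninformative).
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        return 0.0

    # A column is identical when both residues match and neither is a gap.
    identical = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * identical / len(columns)


if __name__ == "__main__":
    # Toy aligned pair (hypothetical, not taken from SEQ ID NO:15).
    print(round(percent_identity("ACDE-G", "ACDQFG"), 1))
```

In this toy pair, four of six columns are identical, one is a mismatch, and one is a gap.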

The invention also encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by the ubiquitin protease. Similarity is determined by conservative amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Conservative substitutions are likely to be phenotypically silent. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1 Conservative Amino Acid Substitutions.
Aromatic: Phenylalanine, Tryptophan, Tyrosine
Hydrophobic: Leucine, Isoleucine, Valine
Polar: Glutamine, Asparagine
Basic: Arginine, Lysine, Histidine
Acidic: Aspartic Acid, Glutamic Acid
Small: Alanine, Serine, Threonine, Methionine, Glycine
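As a non-limiting sketch, the groupings of Table 1 can be encoded as a lookup to test whether a given substitution falls within a single class. The one-letter residue codes, the dictionary name `TABLE_1_GROUPS`, and the function name `is_conservative` are assumptions introduced for this illustration. Residues not listed in Table 1 (for example cysteine and proline) are treated here as having no conservative partner.

```python
# Table 1 classes, written with standard one-letter amino acid codes.
TABLE_1_GROUPS = {
    "aromatic": set("FWY"),     # Phe, Trp, Tyr
    "hydrophobic": set("LIV"),  # Leu, Ile, Val
    "polar": set("QN"),         # Gln, Asn
    "basic": set("RKH"),        # Arg, Lys, His
    "acidic": set("DE"),        # Asp, Glu
    "small": set("ASTMG"),      # Ala, Ser, Thr, Met, Gly
}


def is_conservative(original: str, replacement: str) -> bool:
    """Return True when both residues belong to the same Table 1 class.

    Identical residues are not a substitution and return False.
    """
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False
    return any(a in group and b in group for group in TABLE_1_GROUPS.values())


if __name__ == "__main__":
    for pair in [("L", "I"), ("D", "E"), ("D", "K")]:
        print(pair, is_conservative(*pair))
```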

The comparison of sequences and determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991).

A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al. (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).

In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman et al. (1970) (J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package, using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (Devereux et al. (1984) Nucleic Acids Res. 12(1):387), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
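The following is a minimal sketch of a global (Needleman-Wunsch style) alignment score with separate gap and length weights. It is not the GAP program and does not reproduce its output. Three assumptions are made for illustration: a simple match/mismatch score stands in for the BLOSUM 62 or PAM250 matrix, a gap of k positions is charged `gap_weight + k * length_weight`, and the default weights of 12 and 4 are merely values picked from the ranges recited above. The function name `global_alignment_score` is likewise hypothetical.

```python
import math


def global_alignment_score(a: str, b: str,
                           match: float = 1.0, mismatch: float = -1.0,
                           gap_weight: float = 12.0,
                           length_weight: float = 4.0) -> float:
    """Best global alignment score with affine gap costs (three-state DP)."""
    neg = -math.inf
    n, m = len(a), len(b)
    # M: column pairs a[i-1] with b[j-1]; X: a[i-1] against a gap;
    # Y: b[j-1] against a gap.
    M = [[neg] * (m + 1) for _ in range(n + 1)]
    X = [[neg] * (m + 1) for _ in range(n + 1)]
    Y = [[neg] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0.0
    for i in range(1, n + 1):
        X[i][0] = -(gap_weight + length_weight * i)   # leading gap in b
    for j in range(1, m + 1):
        Y[0][j] = -(gap_weight + length_weight * j)   # leading gap in a

    open_cost = gap_weight + length_weight            # first gap position
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = s + max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1])
            X[i][j] = max(M[i - 1][j] - open_cost,
                          X[i - 1][j] - length_weight,  # extend existing gap
                          Y[i - 1][j] - open_cost)
            Y[i][j] = max(M[i][j - 1] - open_cost,
                          Y[i][j - 1] - length_weight,  # extend existing gap
                          X[i][j - 1] - open_cost)
    return max(M[n][m], X[n][m], Y[n][m])


if __name__ == "__main__":
    # Toy sequences (hypothetical); small weights keep the arithmetic visible.
    print(global_alignment_score("ACDE", "ADE", gap_weight=2, length_weight=1))
```

With these toy inputs the best alignment pairs A, D, and E and opens one single-residue gap opposite C. Substituting a residue-pair scoring matrix for the match/mismatch scores would change only how `s` is computed.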

Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torelli et al. (1994) Comput. Appl. Biosci. 10:3-5; and FASTA described in Pearson et al. (1988) PNAS 85:2444-8.

A variant polypeptide can differ in amino acid sequence by one or moresubstitutions, deletions, insertions, inversions, fusions, andtruncations or a combination of any of these.

Variant polypeptides can be fully functional or can lack function in oneor more activities. Thus, in the present case, variations can affect thefunction, for example, of ubiquitin binding, ubiquitin recognition,interaction with ubiquitinated substrate protein, such as binding orproteolysis, subunit interaction, particularly within the proteasome,activation or binding by ATP, developmental expression, temporalexpression, tissue-specific expression, interacting with cellularcomponents, such as transcriptional regulatory factors, and particularlytrans-acting transcriptional regulatory factors, proteolytic cleavage ofpeptide bonds in polyubiquitin and peptide bonds between ubiquitin orpolyubiquitin and substrate protein, and proteolytic cleavage of peptidebonds between ubiquitin or polyubiquitin and a peptide or amino acid.

Fully functional variants typically contain only conservative variationor variation in non-critical residues or in non-critical regions.Functional variants can also contain substitution of similar aminoacids, which results in no change or an insignificant change infunction. Alternatively, such substitutions may positively or negativelyaffect function to some degree.

Non-functional variants typically contain one or more non-conservativeamino acid substitutions, deletions, insertions, inversions, ortruncation or a substitution, insertion, inversion, or deletion in acritical residue or critical region.

As indicated, variants can be naturally-occurring or can be made byrecombinant means or chemical synthesis to provide useful and novelcharacteristics for the ubiquitin protease polypeptide. This includespreventing immunogenicity from pharmaceutical formulations by preventingprotein aggregation.

Useful variations further include alteration of catalytic activity. For example, one embodiment involves a variation at the binding site that results in binding but not hydrolysis, or slower hydrolysis, of the peptide bond. A further useful variation results in an increased rate of hydrolysis of the peptide bond. A further useful variation at the same site can result in higher or lower affinity for substrate. Useful variations also include changes that provide for affinity for a different ubiquitinated substrate protein than that normally recognized. Other useful variations involving altered recognition affect recognition of the type of substrate normally recognized. For example, one variation could result in recognition of ubiquitinated intact substrate but not of substrate remnants, such as ubiquitinated amino acid or peptide that are proteolysis products that result from the hydrolysis of the intact ubiquitinated substrate. Alternatively, the protease could be varied so that one or more of the remnant products is recognized but not the intact protein substrate. Another variation would affect the ability of the protease to rescue a ubiquitinated protein. Thus, protein substrates that are normally rescued from proteolysis would be subject to degradation. Further useful variations affect the ability of the protease to be induced by activators, such as cytokines, including, but not limited to, those disclosed herein. Another useful variation would affect the recognition of ubiquitin substrate so that the enzyme could not recognize one or more of a linear polyubiquitin, branched chain polyubiquitin, linear polyubiquitinated substrate, or branched chain polyubiquitin substrate. Specific variations include truncation in which, for example, a HIS domain is deleted, the variation resulting in decrease or loss of deubiquitination activity. Another useful variation includes one that prevents activation by ATP. Another useful variation provides a fusion protein in which one or more domains or subregions are operationally fused to one or more domains or subregions from another UBP or from a UCH. Specifically, a domain or subregion can be introduced that provides a rescue function to an enzyme not normally having this function or for recognition of a specific substrate wherein recognition is not available to the original enzyme. Other variations include those that affect ubiquitin recognition or recognition of a ubiquitinated substrate protein. Further variations could affect specific subunit interaction, particularly in the proteasome. Other variations would affect developmental, temporal, or tissue-specific expression. Other variations would affect the interaction with cellular components, such as transcriptional regulatory factors.

Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al. (1989) Science 244:1081-1085). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity, such as peptide hydrolysis in vitro or ubiquitin-dependent in vitro activity, such as proliferative activity, receptor-mediated signal transduction, and other cellular processes including, but not limited to, those disclosed herein that are a function of the ubiquitin system. Sites that are critical for binding or recognition can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al. (1992) J. Mol. Biol. 224:899-904; de Vos et al. (1992) Science 255:306-312).
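For illustration, the set of single-alanine mutants contemplated by an alanine scan can be enumerated in silico as sketched below. The function name `alanine_scan`, the one-letter sequence representation, and the decision to skip positions that are already alanine are assumptions of this sketch. The toy sequence is hypothetical and is not SEQ ID NO:15.

```python
from typing import Iterator, Tuple


def alanine_scan(sequence: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (1-based position, original residue, mutant sequence).

    One mutant is produced per residue, each carrying a single alanine
    replacement; residues that are already alanine are skipped.
    """
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        yield index + 1, residue, mutant


if __name__ == "__main__":
    # Hypothetical five-residue peptide; mutants are named in the usual
    # original-position-replacement style, e.g. M1A.
    for position, residue, mutant in alanine_scan("MAKCH"):
        print(f"{residue}{position}A", mutant)
```

Each mutant sequence would then be expressed and tested in an activity assay of the kind described herein; the enumeration itself says nothing about which residues are critical.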

The assays for deubiquitinating enzyme activity are well known in the art and can be found, for example, in Zhu et al. (1997) Journal of Biological Chemistry 272:51-57, Mitch et al. (1999) American Journal of Physiology 276:C1132-C1138, Liu et al. (1999) Molecular and Cellular Biology 19:3029-3038, and such as those cited in various reviews, for example, Ciechanover et al. (1994) The FASEB Journal 8:182-192, Ciechanover (1994) Biol. Chem. Hoppe-Seyler 375:565-581, Hershko et al. (1998) Annual Review of Biochemistry 67:425-479, Schwartz (1999) Annual Review of Medicine 50:57-74, Ciechanover (1998) EMBO Journal 17:7151-7160, and D'Andrea et al. (1998) Critical Reviews in Biochemistry and Molecular Biology 33:337-352. These assays include, but are not limited to, the disappearance of substrate, including decrease in the amount of polyubiquitin or ubiquitinated substrate protein or protein remnant, appearance of intermediate and end products, such as appearance of free ubiquitin monomers, general protein turnover, specific protein turnover, ubiquitin binding, binding to ubiquitinated substrate protein, subunit interaction, interaction with ATP, interaction with cellular components such as trans-acting regulatory factors, stabilization of specific proteins, and the like.

Substantial homology can be to the entire nucleic acid or amino acidsequence or to fragments of these sequences.

The invention thus also includes polypeptide fragments of the ubiquitinprotease. Fragments can be derived from the amino acid sequence shown inSEQ ID NO:15. However, the invention also encompasses fragments of thevariants of the ubiquitin proteases as described herein.

The fragments to which the invention pertains, however, are not to beconstrued as encompassing fragments that may be disclosed prior to thepresent invention.

Accordingly, a fragment can comprise at least about 11, 15, 20, 25, 30,35, 40, 45, 50 or more contiguous amino acids. Fragments can retain oneor more of the biological activities of the protein, for example theability to bind to ubiquitin or hydrolyze peptide bonds, as well asfragments that can be used as an immunogen to generate ubiquitinprotease antibodies.

Biologically active fragments (peptides which are, for example, 5, 7, 10, 12, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acids in length) can comprise a domain or motif, e.g., catalytic site, UCH family 2 signature, signature for the immunoglobulin and major histocompatibility complex proteins, and sites for glycosylation, cAMP and cGMP-dependent protein kinase phosphorylation, protein kinase C phosphorylation, casein kinase II phosphorylation, tyrosine kinase phosphorylation, N-myristoylation, and amidation. Further possible fragments include the catalytic site or domain including conserved amino acid residues found in UBP and UCH thiol proteases. Such regions include, for example, about amino acids 123 to 138 of SEQ ID NO:15 or the UCH family 2 signature found from about amino acids 365 to about 383 of SEQ ID NO:15. Additional domains include ubiquitin recognition sites, ubiquitin binding sites, sites important for subunit interaction, and sites important for carrying out the other functions of the protease as described herein.

Such domains or motifs can be identified by means of routinecomputerized homology searching procedures.

Fragments, for example, can extend in one or both directions from the functional site to encompass 5, 10, 15, 20, 30, 40, 50, or up to 100 amino acids. Further, fragments can include sub-fragments of the specific domains mentioned above, which sub-fragments retain the function of the domain from which they are derived.
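A minimal sketch of how a fragment extending in one or both directions from a functional site might be computed is given below. The function name `fragment_around_site`, the 1-based inclusive coordinates, and the clipping of flanks at the sequence termini are assumptions made for this illustration. The sequence shown is a hypothetical placeholder, since SEQ ID NO:15 is not reproduced here.

```python
from typing import Tuple


def fragment_around_site(sequence: str, site_start: int, site_end: int,
                         flank_n: int = 0, flank_c: int = 0
                         ) -> Tuple[int, int, str]:
    """Return (fragment_start, fragment_end, fragment), 1-based inclusive.

    The site is extended by flank_n residues toward the amino terminus and
    flank_c residues toward the carboxyl terminus, clipped to the sequence.
    """
    if not (1 <= site_start <= site_end <= len(sequence)):
        raise ValueError("site coordinates fall outside the sequence")
    if flank_n < 0 or flank_c < 0:
        raise ValueError("flank lengths must be non-negative")

    fragment_start = max(1, site_start - flank_n)
    fragment_end = min(len(sequence), site_end + flank_c)
    return fragment_start, fragment_end, sequence[fragment_start - 1:fragment_end]


if __name__ == "__main__":
    placeholder = "ABCDEFGHIJ"  # hypothetical stand-in, not SEQ ID NO:15
    # Site at residues 4-6, extended 2 residues N-terminally and up to
    # 10 residues C-terminally (clipped at the end of the sequence).
    print(fragment_around_site(placeholder, 4, 6, flank_n=2, flank_c=10))
```

Applied to a full-length sequence, the same call with a site of 123 to 138 and flanks chosen from the values recited above would give the coordinates of a fragment spanning that region.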

These regions can be identified by well-known methods involvingcomputerized homology analysis.

The invention also provides fragments with immunogenic properties. These contain an epitope-bearing portion of the ubiquitin protease and variants. These epitope-bearing peptides are useful to raise antibodies that bind specifically to a ubiquitin protease polypeptide or region or fragment. These peptides can contain at least 11, at least 12, at least 14, or between about 15 and about 30 amino acids.

Antigenic polypeptides that can be used to generate antibodies include, but are not limited to, peptides derived from an extracellular site. Regions having a high antigenicity index are shown in FIG. 34. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular peptide regions.

The epitope-bearing ubiquitin protease polypeptides may be produced byany conventional means (Houghten, R. A. (1985) Proc. Natl. Acad. Sci.USA 82:5131-5135). Simultaneous multiple peptide synthesis is describedin U.S. Pat. No. 4,631,211.

Fragments can be discrete (not fused to other amino acids orpolypeptides) or can be within a larger polypeptide. Further, severalfragments can be comprised within a single larger polypeptide. In oneembodiment a fragment designed for expression in a host can haveheterologous pre- and pro-polypeptide regions fused to the aminoterminus of the ubiquitin protease fragment and an additional regionfused to the carboxyl terminus of the fragment.

The invention thus provides chimeric or fusion proteins. These comprisea ubiquitin protease peptide sequence operatively linked to aheterologous peptide having an amino acid sequence not substantiallyhomologous to the ubiquitin protease. “Operatively linked” indicatesthat the ubiquitin protease peptide and the heterologous peptide arefused in-frame. The heterologous peptide can be fused to the N-terminusor C-terminus of the ubiquitin protease or can be internally located.

In one embodiment the fusion protein does not affect ubiquitin proteasefunction per se. For example, the fusion protein can be a GST-fusionprotein in which the ubiquitin protease sequences are fused to theC-terminus of the GST sequences. Other types of fusion proteins include,but are not limited to, enzymatic fusion proteins, for examplebeta-galactosidase fusions, yeast two-hybrid GAL-4 fusions, poly-Hisfusions and Ig fusions. Such fusion proteins, particularly poly-Hisfusions, can facilitate the purification of recombinant ubiquitinprotease. In certain host cells (e.g., mammalian host cells), expressionand/or secretion of a protein can be increased by using a heterologoussignal sequence. Therefore, in another embodiment, the fusion proteincontains a heterologous signal sequence at its N-terminus.

EP-A-0 464 533 discloses fusion proteins comprising various portions of immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EP-A 0232 262). In drug discovery, for example, human proteins have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists (Bennett et al. (1995) J. Mol. Recog. 8:52-58 and Johanson et al. (1995) J. Biol. Chem. 270:9459-9471). Thus, this invention also encompasses soluble fusion proteins containing a ubiquitin protease polypeptide and various portions of the constant regions of heavy or light chains of immunoglobulins of various subclass (IgG, IgM, IgA, IgE). Preferred as immunoglobulin is the constant part of the heavy chain of human IgG, particularly IgG1, where fusion takes place at the hinge region. For some uses it is desirable to remove the Fc after the fusion protein has been used for its intended purpose, for example when the fusion protein is to be used as antigen for immunizations. In a particular embodiment, the Fc part can be removed in a simple way by a cleavage sequence, which is also incorporated and can be cleaved with factor Xa.

A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al. (1992) Current Protocols in Molecular Biology). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). A ubiquitin protease-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the ubiquitin protease.
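By way of illustration only, the in-frame requirement described above can be checked computationally before a construct is assembled. The following is a minimal sketch, not a method of the invention; the function name `fuse_in_frame` and the toy sequences are hypothetical, and the check assumes that both fragments are supplied as coding sequence starting at the first base of a codon.

```python
# Illustrative sketch only: verifies that a fusion-moiety coding sequence
# (supplied without its stop codon) joins in-frame to an insert.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(tag_cds: str, insert_cds: str) -> str:
    """Return the fused coding sequence, or raise ValueError if the
    junction would shift the reading frame or truncate the fusion."""
    tag = tag_cds.upper()
    insert = insert_cds.upper()
    if len(tag) % 3 != 0:
        raise ValueError("fusion moiety length is not a multiple of 3")
    if len(insert) % 3 != 0:
        raise ValueError("insert length is not a multiple of 3")
    fused = tag + insert
    codons = [fused[i:i + 3] for i in range(0, len(fused), 3)]
    # A stop codon anywhere before the final codon truncates the fusion.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        raise ValueError("premature stop codon in fused reading frame")
    return fused


if __name__ == "__main__":
    # Toy sequences, not taken from the sequence listing.
    print(fuse_in_frame("ATGGCC", "GGATCCTAA"))
```

A fragment whose length is not a multiple of three, or a fusion moiety that retains its own stop codon, is rejected rather than silently joined.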

Another form of fusion protein is one that directly affects ubiquitin protease functions. Accordingly, a ubiquitin protease polypeptide is encompassed by the present invention in which one or more of the ubiquitin protease domains (or parts thereof) has been replaced by homologous domains (or parts thereof) from another UBP or UCH species. Accordingly, various permutations are possible. One or more functional sites as disclosed herein from the specifically disclosed protease can be replaced by one or more functional sites from a corresponding UBP family member or from a UCH family member. Thus, chimeric ubiquitin proteases can be formed in which one or more of the native domains or subregions has been replaced by another.

Additionally, chimeric ubiquitin protease proteins can be produced in which one or more functional sites is derived from a different ubiquitin protease family. It is understood, however, that sites could be derived from ubiquitin protease families that occur in the mammalian genome but which have not yet been discovered or characterized. Such sites include but are not limited to any of the functional sites disclosed herein.

The isolated ubiquitin proteases can be purified from any of the cells that naturally express them, such as fetal kidney, testes, fetal liver, ovary, fetal heart, kidney, thyroid, undifferentiated osteoblasts, skeletal muscle, malignant breast tissue, primary lung tumors and liver metastases derived from colon. Alternatively, the ubiquitin protease may be purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods.

In one embodiment, the protein is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the ubiquitin protease polypeptide is cloned into an expression vector, the expression vector is introduced into a host cell and the protein is expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally-occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in polypeptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art.

Accordingly, the polypeptides also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence for purification of the mature polypeptide or a pro-protein sequence.

Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

Such modifications are well known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications (glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance) are described in most basic texts, such as Proteins: Structure and Molecular Properties, 2nd ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (1990) Meth. Enzymol. 182:626-646; and Rattan et al. (1992) Ann. N.Y. Acad. Sci. 663:48-62.

As is also well known, polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of post-translation events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translational natural processes and by synthetic methods.

Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. Blockage of the amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in naturally-occurring and synthetic polypeptides. For instance, the amino-terminal residue of polypeptides made in E. coli, prior to proteolytic processing, almost invariably will be N-formylmethionine.

The modifications can be a function of how the protein is made. For recombinant polypeptides, for example, the modifications will be determined by the host cell posttranslational modification capacity and the modification signals in the polypeptide amino acid sequence. Accordingly, when glycosylation is desired, a polypeptide should be expressed in a glycosylating host, generally a eukaryotic cell. Insect cells often carry out the same posttranslational glycosylations as mammalian cells and, for this reason, insect cell expression systems have been developed to efficiently express mammalian proteins having native patterns of glycosylation. Similar considerations apply to other modifications.

The same type of modification may be present in the same or varying degree at several sites in a given polypeptide. Also, a given polypeptide may contain more than one type of modification.

Polypeptide Uses

The protein sequences of the present invention can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
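The role of the wordlength parameter in such searches can be illustrated with a simplified sketch. The code below is not the BLAST algorithm itself: it finds only exact shared words between a query and a subject sequence, whereas protein BLAST also scores similar ("neighborhood") words and then extends and scores each hit. The function name `find_word_hits` and the example sequences are hypothetical.

```python
from collections import defaultdict


def find_word_hits(query: str, subject: str, wordlength: int = 3):
    """Return (query_pos, subject_pos) pairs where the two sequences
    share an identical word of the given length (0-based positions)."""
    # Index every word of the query by its start position.
    index = defaultdict(list)
    for i in range(len(query) - wordlength + 1):
        index[query[i:i + wordlength]].append(i)
    # Scan the subject and report each word that also occurs in the query.
    hits = []
    for j in range(len(subject) - wordlength + 1):
        for i in index.get(subject[j:j + wordlength], []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Toy peptide strings, not taken from the sequence listing.
    print(find_word_hits("MKVLAA", "GGKVLT", wordlength=3))
```

A shorter wordlength yields more seed hits and therefore a more sensitive but slower search; a longer wordlength, such as the value of 12 given above for nucleotide searches, yields fewer seeds.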

The ubiquitin protease polypeptides are useful for producing antibodies specific for the ubiquitin protease, regions, or fragments. Regions having a high antigenicity index score are shown in FIG. 34.

The ubiquitin protease polypeptides are useful for biological assays related to ubiquitin protease function. Such assays involve any of the known functions or activities or properties useful for diagnosis and treatment of ubiquitin- or ubiquitin protease-related conditions or conditions in which expression of the protease is relevant, such as in viral infections. Potential assays have been disclosed herein and generically include disappearance of substrate, appearance of end product, and general or specific protein turnover.

The ubiquitin protease polypeptides are also useful in drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the ubiquitin protease, obtained as a biopsy or expanded in cell culture. In one embodiment, however, cell-based assays involve recombinant host cells expressing the ubiquitin protease.

Determining the ability of the test compound to interact with the ubiquitin protease can also comprise determining the ability of the test compound to preferentially bind to the polypeptide as compared to the ability of a known binding molecule (e.g., ubiquitin) to bind to the polypeptide.

The polypeptides can be used to identify compounds that modulate ubiquitin protease activity. Such compounds, for example, can increase or decrease affinity for polyubiquitin, either linear or branched chain, ubiquitinated protein substrate, or ubiquitinated protein substrate remnants. Such compounds could also, for example, increase or decrease the rate of binding to these components. Such compounds could also compete with these components for binding to the ubiquitin protease or displace these components bound to the ubiquitin protease. Such compounds could also affect interaction with other components, such as ATP, other subunits, for example, in the 19S complex, and transcriptional regulatory factors. It is understood, therefore, that such compounds can be identified not only by means of ubiquitin, but by means of any of the components that functionally interact with the disclosed protease. This includes, but is not limited to, any of those components disclosed herein.

Both ubiquitin protease and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to the ubiquitin protease. These compounds can be further screened against a functional ubiquitin protease to determine the effect of the compound on the ubiquitin protease activity. Compounds can be identified that activate (agonist) or inactivate (antagonist) the ubiquitin protease to a desired degree. Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject).
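As a purely illustrative sketch of how readouts from such a functional screen might be scored, each well's signal can be expressed as a percentage of the activity of the uninhibited enzyme and compared with chosen cutoffs. The function names and the cutoff values of 50% and 150% below are hypothetical assumptions, not values taught herein; the "desired degree" of activation or inactivation would be set by the practitioner.

```python
def percent_activity(signal: float, blank: float, uninhibited: float) -> float:
    """Activity in a test well as a percentage of the uninhibited control,
    after subtracting the no-enzyme blank from both."""
    window = uninhibited - blank
    if window <= 0:
        raise ValueError("uninhibited control must exceed the blank")
    return 100.0 * (signal - blank) / window


def classify_compound(signal: float, blank: float, uninhibited: float,
                      low: float = 50.0, high: float = 150.0) -> str:
    """Label a compound using hypothetical percent-activity cutoffs."""
    pct = percent_activity(signal, blank, uninhibited)
    if pct <= low:
        return "antagonist"
    if pct >= high:
        return "agonist"
    return "inactive"


if __name__ == "__main__":
    # Arbitrary example readings in arbitrary signal units.
    print(classify_compound(signal=30.0, blank=10.0, uninhibited=110.0))
```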

The ubiquitin protease polypeptides can be used to screen a compound for the ability to stimulate or inhibit interaction between the ubiquitin protease protein and a target molecule that normally interacts with the ubiquitin protease protein. The target can be ubiquitin, ubiquitinated substrate, or polyubiquitin or another component of the pathway with which the ubiquitin protease protein normally interacts (for example, ATP). The assay includes the steps of combining the ubiquitin protease protein with a candidate compound under conditions that allow the ubiquitin protease protein or fragment to interact with the target molecule, and detecting the formation of a complex between the ubiquitin protease protein and the target or detecting the biochemical consequence of the interaction of the ubiquitin protease with the target. Any of the associated effects of protease function can be assayed. This includes the production of hydrolysis products, such as free terminal peptide substrate, free terminal amino acid from the hydrolyzed substrate, free ubiquitin, lower molecular weight species of hydrolyzed polyubiquitin, released intact substrate protein resulting from rescue from proteolysis, free polyubiquitin formed from hydrolysis of the polyubiquitin from intact substrate, and substrate remnants, such as amino acids and peptides produced from proteolysis of the substrate protein, and biological endpoints of the pathway.

Determining the ability of the ubiquitin protease to bind to a target molecule can also be accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) (Sjolander et al. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). As used herein, “BIA” is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.
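Real-time SPR data of this kind are commonly interpreted with a simple 1:1 (Langmuir) binding model, in which the response during analyte injection rises toward an equilibrium value at an observed rate that depends on analyte concentration. The sketch below is offered under that assumption only; the model choice, the function name, and the rate constants in the example are illustrative and are not measurements for the disclosed protease.

```python
import math


def spr_association_response(t_s: float, conc_M: float, ka_per_M_s: float,
                             kd_per_s: float, rmax_RU: float) -> float:
    """Response (RU) at time t_s during association for a 1:1 model:
    R(t) = Req * (1 - exp(-(ka*C + kd) * t)), Req = Rmax * C / (C + KD)."""
    k_obs = ka_per_M_s * conc_M + kd_per_s   # observed rate constant (1/s)
    kd_equilibrium = kd_per_s / ka_per_M_s   # equilibrium constant KD (M)
    r_eq = rmax_RU * conc_M / (conc_M + kd_equilibrium)
    return r_eq * (1.0 - math.exp(-k_obs * t_s))


if __name__ == "__main__":
    # Hypothetical constants: ka = 1e5 /M/s, kd = 1e-3 /s, so KD = 10 nM.
    for t in (0, 100, 500, 2000):
        r = spr_association_response(t, 1e-8, 1e5, 1e-3, 100.0)
        print(t, round(r, 2))
```

When the analyte concentration equals KD, the plateau response under this model is half of the maximal response, which is one way an affinity estimate can be read from a concentration series.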

The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra).

Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al. (1991) Nature 354:82-84; Houghten et al. (1991) Nature 354:84-86) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al. (1993) Cell 72:767-778); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble full-length ubiquitin protease or fragment that competes for substrate binding. Other candidate compounds include mutant ubiquitin proteases or appropriate fragments containing mutations that affect ubiquitin protease function and compete for substrate. Accordingly, a fragment that competes for substrate, for example with a higher affinity, or a fragment that binds substrate but does not hydrolyze the peptide bond, is encompassed by the invention.

Other candidate compounds include ubiquitinated protein or protein analog that binds to the protease but is not released or is released slowly. Other candidate compounds include analogs of the other natural substrates, such as substrate remnants that bind to the protease but are not released or are released more slowly. Further candidate compounds include activators of the proteases such as cytokines, including but not limited to, those disclosed herein.

The invention provides other end points to identify compounds that modulate (stimulate or inhibit) ubiquitin protease activity. The assays typically involve an assay of events in the pathway that indicate ubiquitin protease activity. This can include cellular events that result from deubiquitination, such as cell cycle progression, programmed cell death, growth factor-mediated signal transduction, or any of the cellular processes including, but not limited to, those disclosed herein as resulting from deubiquitination. Specific phenotypes include changes in stress response, DNA replication, receptor internalization, cellular transformation or reversal of transformation, and transcriptional silencing.

Assays are based on the multiple cellular functions of deubiquitinating enzymes. These enzymes act at various different levels in the regulation of protein ubiquitination. A deubiquitinating enzyme can degrade a linear polyubiquitin chain into monomeric ubiquitin molecules. Deubiquitinating enzymes, such as isopeptidase-T, can degrade a branched multiubiquitin chain into monomeric ubiquitin molecules. Deubiquitinating enzymes can remove ubiquitin from a ubiquitin-conjugated target protein. The deubiquitinating enzyme, such as FAF or PA700 isopeptidase, can remove polyubiquitin from a ubiquitinated target protein, and thereby rescue the target from degradation by the 26S proteasome. Deubiquitinating enzymes such as Doa-4 can remove polyubiquitin from proteasome degradation products. UCH family members tend to hydrolyze monoubiquitinated substrate (Larsen et al. (1998) Biochemistry 10:3358-68). The UCH deubiquitinating enzyme AP-UCH enhances proteolytic activity of Protein Kinase A (PKA) through the ubiquitin-proteasome pathway. Furthermore, BAP1 has been identified as a new member of the UCH family and interacts with BRCA1, thereby enhancing BRCA1-mediated cell growth suppression (Jensen et al. (1998) Oncogene 16:1097-1112). The end result of all of the deubiquitinating enzymes is to regulate the cellular pool of free monomeric ubiquitin. Accordingly, assays can be based on detection of any of the products produced by hydrolysis/deubiquitination.

Further, the expression of genes that are up- or down-regulated by action of the ubiquitin protease can be assayed. In one embodiment, the regulatory region of such genes can be operably linked to a marker that is easily detectable, such as luciferase.

Accordingly, any of the biological or biochemical functions mediated by the ubiquitin protease can be used as an endpoint assay. These include all of the biochemical or biochemical/biological events described herein, in the references cited herein, incorporated by reference for these endpoint assay targets, and other functions known to those of ordinary skill in the art.

Binding and/or activating compounds can also be screened by using chimeric ubiquitin protease proteins in which one or more domains, sites, and the like, as disclosed herein, or parts thereof, can be replaced by their heterologous counterparts derived from other ubiquitin proteases. For example, a recognition or binding region can be used that interacts with different substrate specificity and/or affinity than the native ubiquitin protease. Accordingly, a different set of pathway components is available as an end-point assay for activation. Further, sites that are responsible for developmental, temporal, or tissue specificity can be replaced by heterologous sites such that the protease can be detected under conditions of specific developmental, temporal, or tissue-specific expression.

The ubiquitin protease polypeptides are also useful in competition binding assays in methods designed to discover compounds that interact with the ubiquitin protease. Thus, a compound is exposed to a ubiquitin protease polypeptide under conditions that allow the compound to bind to or to otherwise interact with the polypeptide. Soluble ubiquitin protease polypeptide is also added to the mixture. If the test compound interacts with the soluble ubiquitin protease polypeptide, it decreases the amount of complex formed or activity from the ubiquitin protease target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the ubiquitin protease. Thus, the soluble polypeptide that competes with the target ubiquitin protease region is designed to contain peptide sequences corresponding to the region of interest.

Another type of competition-binding assay can be used to discover compounds that interact with specific functional sites. As an example, ubiquitin and a candidate compound can be added to a sample of the ubiquitin protease. Compounds that interact with the ubiquitin protease at the same site as ubiquitin will reduce the amount of complex formed between the ubiquitin protease and ubiquitin. Accordingly, it is possible to discover a compound that specifically prevents interaction between the ubiquitin protease and ubiquitin. Another example involves adding a candidate compound to a sample of ubiquitin protease and polyubiquitin. A compound that competes with polyubiquitin will reduce the amount of hydrolysis or binding of the polyubiquitin to the ubiquitin protease. Accordingly, compounds can be discovered that directly interact with the ubiquitin protease and compete with polyubiquitin. Such assays can involve any other component that interacts with the ubiquitin protease, such as ubiquitinated substrate protein, ubiquitinated substrate remnants, and cellular components with which the protease interacts such as transcriptional regulatory factors.
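The expected reduction in complex formation in such a competition assay can be estimated with the standard equilibrium expression for a competitor that binds the same site as the ligand. The sketch below assumes simple reversible 1:1 binding with both ligand and competitor in excess over the protease; the function name and all concentrations and constants are hypothetical and chosen only to show the arithmetic.

```python
def fraction_bound(ligand: float, kd_ligand: float,
                   competitor: float = 0.0, ki_competitor: float = 1.0) -> float:
    """Fraction of protease occupied by ligand (e.g., ubiquitin) in the
    presence of a competitive compound:
        theta = [L] / ([L] + Kd * (1 + [I] / Ki))
    All quantities must share the same concentration units."""
    if ligand < 0 or competitor < 0 or kd_ligand <= 0 or ki_competitor <= 0:
        raise ValueError("concentrations must be >= 0 and constants > 0")
    apparent_kd = kd_ligand * (1.0 + competitor / ki_competitor)
    return ligand / (ligand + apparent_kd)


if __name__ == "__main__":
    # Hypothetical values in micromolar: ligand at its Kd, competitor at its Ki.
    print(fraction_bound(1.0, 1.0))              # no competitor
    print(fraction_bound(1.0, 1.0, 1.0, 1.0))    # with competitor
```

Under these assumptions a competitor present at its Ki doubles the apparent Kd of the ligand, so occupancy by a ligand present at its Kd falls from one half to one third.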

To perform cell-free drug screening assays, it is desirable to immobilize either the ubiquitin protease, or fragment, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase/ubiquitin protease fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of ubiquitin protease-binding protein found in the bead fraction quantitated from the gel using standard electrophoretic techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the protein but which do not interfere with binding of the protein to its target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by antibody conjugation. Preparations of a ubiquitin protease-binding target component, such as ubiquitin, polyubiquitin, ubiquitinated substrate protein, ubiquitinated substrate protein remnant, or ubiquitinated remnant amino acid, and a candidate compound are incubated in the ubiquitin protease-presenting wells and the amount of complex trapped in the well can be quantitated. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the ubiquitin protease target molecule, or which are reactive with ubiquitin protease and compete with the target molecule; as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the target molecule.

Modulators of ubiquitin protease activity identified according to these drug screening assays can be used to treat a subject with a disorder mediated or affected by the ubiquitin protease pathway, by treating cells that express the ubiquitin protease or cells in which protease expression is desirable (such as virus-infected cells). Such cells include, for example, fetal kidney, testes, fetal liver, ovary, fetal heart, kidney, thyroid, undifferentiated osteoblasts, skeletal muscle, and malignant breast, lung and colon tissue, as well as liver metastases derived from malignant colonic tissue. These methods of treatment include the steps of administering the modulators of ubiquitin protease activity in a pharmaceutical composition as described herein, to a subject in need of such treatment.

Tissues and/or cells in which the ubiquitin protease is expressed include, but are not limited to, those shown in FIGS. 37 and 38. Tissues in which the gene is highly expressed include fetal kidney, testes, fetal liver, ovary, and fetal heart. Expression is also seen in the kidney, thyroid, undifferentiated osteoblasts and skeletal muscle. The ubiquitin protease is also expressed in normal liver and in normal and malignant breast, lung, and colon tissue and in liver metastases derived from malignant colonic tissues. Hence, the ubiquitin protease is relevant to treating disorders involving these tissues, such as breast, lung, and colon carcinoma, and colon metastases to liver.

Disorders involving the liver include, but are not limited to, hepatic injury; jaundice and cholestasis, such as bilirubin and bile formation; hepatic failure and cirrhosis, such as cirrhosis, portal hypertension, including ascites, portosystemic shunts, and splenomegaly; infectious disorders, such as viral hepatitis, including hepatitis A-E infection and infection by other hepatitis viruses, clinicopathologic syndromes, such as the carrier state, asymptomatic infection, acute viral hepatitis, chronic viral hepatitis, and fulminant hepatitis; autoimmune hepatitis; drug- and toxin-induced liver disease, such as alcoholic liver disease; inborn errors of metabolism and pediatric liver disease, such as hemochromatosis, Wilson disease, α₁-antitrypsin deficiency, and neonatal hepatitis; intrahepatic biliary tract disease, such as secondary biliary cirrhosis, primary biliary cirrhosis, primary sclerosing cholangitis, and anomalies of the biliary tree; circulatory disorders, such as impaired blood flow into the liver, including hepatic artery compromise and portal vein obstruction and thrombosis, impaired blood flow through the liver, including passive congestion and centrilobular necrosis and peliosis hepatis, hepatic vein outflow obstruction, including hepatic vein thrombosis (Budd-Chiari syndrome) and veno-occlusive disease; hepatic disease associated with pregnancy, such as preeclampsia and eclampsia, acute fatty liver of pregnancy, and intrahepatic cholestasis of pregnancy; hepatic complications of organ or bone marrow transplantation, such as drug toxicity after bone marrow transplantation, graft-versus-host disease and liver rejection, and nonimmunologic damage to liver allografts; tumors and tumorous conditions, such as nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

Disorders involving the heart include, but are not limited to, heart failure, including but not limited to, cardiac hypertrophy, left-sided heart failure, and right-sided heart failure; ischemic heart disease, including but not limited to angina pectoris, myocardial infarction, chronic ischemic heart disease, and sudden cardiac death; hypertensive heart disease, including but not limited to, systemic (left-sided) hypertensive heart disease and pulmonary (right-sided) hypertensive heart disease; valvular heart disease, including but not limited to, valvular degeneration caused by calcification, such as calcific aortic stenosis, calcification of a congenitally bicuspid aortic valve, and mitral annular calcification, and myxomatous degeneration of the mitral valve (mitral valve prolapse), rheumatic fever and rheumatic heart disease, infective endocarditis, and noninfected vegetations, such as nonbacterial thrombotic endocarditis and endocarditis of systemic lupus erythematosus (Libman-Sacks disease), carcinoid heart disease, and complications of artificial valves; myocardial disease, including but not limited to dilated cardiomyopathy, hypertrophic cardiomyopathy, restrictive cardiomyopathy, and myocarditis; pericardial disease, including but not limited to, pericardial effusion and hemopericardium and pericarditis, including acute pericarditis and healed pericarditis, and rheumatoid heart disease; neoplastic heart disease, including but not limited to, primary cardiac tumors, such as myxoma, lipoma, papillary fibroelastoma, rhabdomyoma, and sarcoma, and cardiac effects of noncardiac neoplasms; congenital heart disease, including but not limited to, left-to-right shunts (late cyanosis), such as atrial septal defect, ventricular septal defect, patent ductus arteriosus, and atrioventricular septal defect, right-to-left shunts (early cyanosis), such as tetralogy of Fallot, transposition of great arteries, truncus arteriosus, tricuspid atresia, and total anomalous pulmonary venous connection, obstructive congenital anomalies, such as coarctation of aorta, pulmonary stenosis and atresia, and aortic stenosis and atresia, and disorders involving cardiac transplantation.

Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease and simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic glomerulonephritis, glomerular lesions associated with systemic disease, including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including, but not limited to, acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, and nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including, but not limited to, benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies, including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney), which includes urothelial carcinomas of renal pelvis.

Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors, such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms. Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

Disorders involving the testis and epididymis include, but are not limited to, congenital anomalies such as cryptorchidism, regressive changes such as atrophy, inflammations such as nonspecific epididymitis and orchitis, granulomatous (autoimmune) orchitis, and specific inflammations including, but not limited to, gonorrhea, mumps, tuberculosis, and syphilis, vascular disturbances including torsion, testicular tumors including germ cell tumors that include, but are not limited to, seminoma, spermatocytic seminoma, embryonal carcinoma, yolk sac tumor, choriocarcinoma, teratoma, and mixed tumors, tumors of sex cord-gonadal stroma including, but not limited to, Leydig (interstitial) cell tumors and Sertoli cell tumors (androblastoma), and testicular lymphoma, and miscellaneous lesions of tunica vaginalis.

Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma.

Disorders involving the thyroid include, but are not limited to, hyperthyroidism; hypothyroidism including, but not limited to, cretinism and myxedema; thyroiditis including, but not limited to, Hashimoto thyroiditis, subacute (granulomatous) thyroiditis, and subacute lymphocytic (painless) thyroiditis; Graves disease; diffuse and multinodular goiter including, but not limited to, diffuse nontoxic (simple) goiter and multinodular goiter; neoplasms of the thyroid including, but not limited to, adenomas, other benign tumors, and carcinomas, which include, but are not limited to, papillary carcinoma, follicular carcinoma, medullary carcinoma, and anaplastic carcinoma; and congenital anomalies.

Disorders involving the skeletal muscle include tumors, such as rhabdomyosarcoma.

Disorders involving the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

Bone-forming cells include the osteoprogenitor cells, osteoblasts, and osteocytes. The disorders of the bone are complex because they may have an impact on the skeleton during any of its stages of development. Hence, the disorders may have variable manifestations and may involve one, multiple or all bones of the body. Such disorders include congenital malformations, achondroplasia and thanatophoric dwarfism, diseases associated with abnormal matrix such as type 1 collagen disease, osteoporosis, Paget disease, rickets, osteomalacia, high-turnover osteodystrophy, low-turnover or aplastic disease, osteonecrosis, pyogenic osteomyelitis, tuberculous osteomyelitis, osteoma, osteoid osteoma, osteoblastoma, osteosarcoma, osteochondroma, chondromas, chondroblastoma, chondromyxoid fibroma, chondrosarcoma, fibrous cortical defects, fibrous dysplasia, fibrosarcoma, malignant fibrous histiocytoma, Ewing sarcoma, primitive neuroectodermal tumor, giant cell tumor, and metastatic tumors.

The ubiquitin-proteasome pathway has been implicated in the regulation of viral infection. Recent studies have shown that ubiquitination of the herpes simplex virus type 1 (HSV-1) transactivator protein ICP0 and the hepatitis B virus X protein (HBX) is influenced by the ubiquitin-proteasome pathway during viral infection (Weber et al. (1999) Virology 253:288-98 and Hu et al. (1999) J Virol 73:7231-40). In addition, inactivation of the ubiquitin-proteasome pathway inhibits Vmw110, an immediate early protein of HSV-1, from stimulating lytic infection (Everett et al. (1998) EMBO J. 17:7161-9). Furthermore, a cellular deubiquitinating enzyme, herpesvirus-associated ubiquitin-specific protease (HAUSP), has also been implicated in the regulation of HSV infection (Everett et al. (1997) EMBO J. 16:1519-1530). Hence, the ubiquitin protease finds use in the treatment of disorders resulting from viral infection.

Transcriptional profiling and Taqman profiling techniques showed that the expression of the ubiquitin protease of the present invention was upregulated in HSV-infected human ganglia cells compared to uninfected ganglia. Furthermore, cell lines that express a hepatitis B virus (HepG2.215) showed higher expression levels of the ubiquitin protease 23484 when compared to the parental HepG2 control cell line. The ubiquitin protease 23484 is therefore an important host gene for HSV and HBV pathogenesis and finds use in the treatment of disorders resulting from herpes simplex virus and hepatitis B infection.

Additional disorders in which the ubiquitin protease expression is relevant include, but are not limited to, the following:

Respiratory viral pathogens and their associated disorders include, for example, adenovirus, resulting in upper and lower respiratory tract infections, conjunctivitis and diarrhea; echovirus, resulting in upper respiratory tract infections, pharyngitis and rash; rhinovirus, resulting in upper respiratory tract infections; coxsackievirus, resulting in pleurodynia, herpangina, hand-foot-mouth disease; coronavirus, resulting in upper respiratory tract infections; influenza A and B viruses, resulting in influenza; parainfluenza virus 1-4, resulting in upper and lower respiratory tract infections and croup; and respiratory syncytial virus, resulting in bronchiolitis and pneumonia.

Digestive viral pathogens and their associated disorders include, for example, mumps virus, resulting in mumps, pancreatitis, and orchitis; rotavirus, resulting in childhood diarrhea; Norwalk Agent, resulting in gastroenteritis; hepatitis A virus, resulting in acute viral hepatitis; hepatitis B virus, hepatitis D virus and hepatitis C virus, resulting in acute or chronic hepatitis; hepatitis E virus, resulting in enterically transmitted hepatitis.

Systemic viral pathogens associated with disorders involving skin eruptions include, for example, measles virus, resulting in measles (rubeola); rubella virus, resulting in German measles (rubella); parvovirus, resulting in erythema infectiosum and aplastic anemia; varicella-zoster virus, resulting in chicken pox and shingles; herpes simplex virus 1, resulting in cold sores; and herpes simplex virus 2, resulting in genital herpes.

Systemic viral pathogens associated with hematopoietic disorders include, for example, cytomegalovirus, resulting in cytomegalic inclusion disease; Epstein-Barr virus, resulting in mononucleosis; HTLV-1, resulting in adult T-cell leukemia and tropical spastic paraparesis; HTLV-II; and HIV 1 and HIV 2, resulting in AIDS.

Arboviral pathogens associated with hemorrhagic fevers include, for example, dengue virus 1-4, resulting in dengue and hemorrhagic fever; yellow fever virus, resulting in yellow fever; Colorado tick fever virus, resulting in Colorado tick fever; and regional hemorrhagic fever viruses, resulting in Bolivian, Argentinian, and Lassa fever.

Viral pathogens associated with warty growths and other hyperplasias include, for example, papillomavirus, resulting in condyloma and cervical carcinoma; and molluscum virus, resulting in molluscum contagiosum.

Viral pathogens associated with central nervous system disorders include, for example, poliovirus, resulting in poliomyelitis; rabies virus, associated with rabies; JC virus, associated with progressive multifocal leukoencephalopathy; and arboviral encephalitis viruses, resulting in Eastern, Western, Venezuelan, St. Louis, or California group encephalitis.

Viral pathogens associated with cancer include, for example, human papillomaviruses, implicated in the genesis of several cancers including squamous cell carcinoma of the cervix and anogenital region, oral cancer and laryngeal cancers; Epstein-Barr virus, implicated in pathogenesis of the African form of Burkitt lymphoma, B-cell lymphomas, Hodgkin disease, and nasopharyngeal carcinomas; hepatitis B virus, implicated in liver cancer; human T-cell leukemia virus type 1 (HTLV-1), associated with T-cell leukemia/lymphoma; and the Kaposi sarcoma herpesvirus (KSHV).

The ubiquitin protease polypeptides are thus useful for treating a ubiquitin protease-associated disorder characterized by aberrant expression or activity of a ubiquitin protease. The polypeptides can also be useful for treating a disorder characterized by excessive amounts of polyubiquitin or ubiquitinated substrate/remnant/amino acid. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) expression or activity of the protein. In another embodiment, the method involves administering the ubiquitin protease as therapy to compensate for reduced or aberrant expression or activity of the protein.

Methods for treatment include but are not limited to the use of soluble ubiquitin protease or fragments of the ubiquitin protease protein that compete for substrates including those disclosed herein. These ubiquitin proteases or fragments can have a higher affinity for the target so as to provide effective competition.

Stimulation of activity is desirable in situations in which the protein is abnormally downregulated and/or in which increased activity is likely to have a beneficial effect, such as virally-infected cells. Likewise, inhibition of activity is desirable in situations in which the protein is abnormally upregulated and/or in which decreased activity is likely to have a beneficial effect. In one example of such a situation, a subject has a disorder characterized by aberrant development or cellular differentiation. In another example, the subject has a proliferative disease (e.g., cancer) or a disorder characterized by an aberrant hematopoietic response. In another example, it is desirable to achieve tissue regeneration in a subject (e.g., where a subject has undergone brain or spinal cord injury and it is desirable to regenerate neuronal tissue in a regulated manner).

In yet another aspect of the invention, the proteins of the invention can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO 94/10300), to identify other proteins (captured proteins) which bind to or interact with the proteins of the invention and modulate their activity.

The ubiquitin protease polypeptides also are useful to provide a target for diagnosing a disease or predisposition to disease mediated by the ubiquitin protease, including, but not limited to, diseases involving tissues in which the ubiquitin proteases are expressed as disclosed herein, such as breast, lung, and liver cancer (colon metastases). Accordingly, methods are provided for detecting the presence, or levels of, the ubiquitin protease in a cell, tissue, or organism. The method involves contacting a biological sample with a compound capable of interacting with the ubiquitin protease such that the interaction can be detected.

The polypeptides are also useful for treating a disorder characterized by reduced amounts of these components. Thus, increasing or decreasing the activity of the protease is beneficial to treatment. The polypeptides are also useful to provide a target for diagnosing a disease characterized by excessive substrate or reduced levels of substrate. Accordingly, where substrate is excessive, use of the protease polypeptides can provide a diagnostic assay. Furthermore, for example, proteases having reduced activity can be used to diagnose conditions in which reduced substrate is responsible for the disorder.

One agent for detecting ubiquitin protease is an antibody capable of selectively binding to ubiquitin protease. A biological sample includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject.

The ubiquitin protease also provides a target for diagnosing active disease, or predisposition to disease, in a patient having a variant ubiquitin protease. Thus, ubiquitin protease can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in an aberrant protein. This includes amino acid substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest, altered ubiquitin protease activity in cell-based or cell-free assay, alteration in binding to or hydrolysis of polyubiquitin, binding to ubiquitinated substrate protein or hydrolysis of the ubiquitin from the protein, binding to ubiquitinated protein remnant, including peptide or amino acid, and hydrolysis of the ubiquitin from the remnant, general protein turnover, specific protein turnover, antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a protein in general or in a ubiquitin protease specifically, including assays discussed herein.

In vitro techniques for detection of ubiquitin protease include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. Alternatively, the protein can be detected in vivo in a subject by introducing into the subject a labeled anti-ubiquitin protease antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Particularly useful are methods that detect the allelic variant of the ubiquitin protease expressed in a subject, and methods that detect fragments of the ubiquitin protease in a sample.

The ubiquitin protease polypeptides are also useful in pharmacogenomic analysis. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985, and Linder, M. W. (1997) Clin. Chem. 43(2):254-266. The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants of the ubiquitin protease in which one or more of the ubiquitin protease functions in one population is different from those in another population. The polypeptides thus allow a target to ascertain a genetic predisposition that can affect treatment modality. Thus, in a ubiquitin-based treatment, polymorphism may give rise to catalytic regions that are more or less active. Accordingly, dosage would necessarily be modified to maximize the therapeutic effect within a given population containing the polymorphism. As an alternative to genotyping, specific polymorphic polypeptides could be identified.

The ubiquitin protease polypeptides are also useful for monitoring therapeutic effects during clinical trials and other treatment. Thus, the therapeutic effectiveness of an agent that is designed to increase or decrease gene expression, protein levels or ubiquitin protease activity can be monitored over the course of treatment using the ubiquitin protease polypeptides as an end-point target. The monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression or activity of the protein in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein in the post-administration samples; (v) comparing the level of expression or activity of the protein in the pre-administration sample with the protein in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.
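By way of illustration only, the comparison and adjustment of steps (v) and (vi) could be sketched as follows. The function name `recommend_adjustment`, the `target_fold` parameter (the fold change the agent is intended to achieve), the 10% tolerance band, and the use of a log-scale ratio are all assumptions introduced for this sketch; the specification does not prescribe any particular decision rule.

```python
import math
from statistics import mean
from typing import Sequence


def recommend_adjustment(
    pre_level: float,
    post_levels: Sequence[float],
    target_fold: float,
    tolerance: float = 0.10,
) -> str:
    """Return 'increase', 'decrease', or 'maintain' for administration of the agent.

    pre_level   -- expression or activity measured in the pre-administration sample
    post_levels -- measurements from one or more post-administration samples
    target_fold -- hypothetical intended fold change (>1 to upregulate, <1 to downregulate)
    """
    if pre_level <= 0 or not post_levels or min(post_levels) <= 0:
        raise ValueError("levels must be positive and post_levels non-empty")
    if target_fold <= 0 or target_fold == 1.0:
        raise ValueError("target_fold must be positive and differ from 1")

    # Step (v): compare pre- and post-administration levels as a fold change.
    fold_change = mean(post_levels) / pre_level

    # Fraction of the intended effect achieved, on a log scale so that
    # upregulation and downregulation targets are handled symmetrically.
    achieved = math.log2(fold_change) / math.log2(target_fold)

    # Step (vi): increase or decrease administration accordingly.
    if achieved < 1.0 - tolerance:
        return "increase"
    if achieved > 1.0 + tolerance:
        return "decrease"
    return "maintain"
```

For instance, with a pre-administration level of 10.0, post-administration levels of 12.0 and 14.0, and an assumed two-fold target, the sketch suggests increasing administration, because only part of the intended change has been reached.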

Antibodies

The invention also provides antibodies that selectively bind to the ubiquitin protease and its variants and fragments. An antibody is considered to selectively bind, even if it also binds to other proteins that are not substantially homologous with the ubiquitin protease. These other proteins share homology with a fragment or domain of the ubiquitin protease. This conservation in specific regions gives rise to antibodies that bind to both proteins by virtue of the homologous sequence. In this case, it would be understood that antibody binding to the ubiquitin protease is still selective.

To generate antibodies, an isolated ubiquitin protease polypeptide is used as an immunogen to generate antibodies using standard techniques for polyclonal and monoclonal antibody preparation. Either the full-length protein or antigenic peptide fragment can be used. Regions having a high antigenicity index are shown in FIG. 34.

Antibodies are preferably prepared from these regions or from discrete fragments in these regions. However, antibodies can be prepared from any region of the peptide as described herein. A preferred fragment produces an antibody that diminishes or completely prevents substrate hydrolysis or binding. Antibodies can be developed against the entire ubiquitin protease or domains of the ubiquitin protease as described herein. Antibodies can also be developed against specific functional sites as disclosed herein.

The antigenic peptide can comprise a contiguous sequence of at least 12, 14, 15, or amino acid residues. In one embodiment, fragments correspond to regions that are located on the surface of the protein, e.g., hydrophilic regions. These fragments are not to be construed, however, as encompassing any fragments, which may be disclosed prior to the invention.

Antibodies can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin, and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

An appropriate immunogenic preparation can be derived from native, recombinantly expressed, or chemically synthesized peptides.

Antibody Uses

The antibodies can be used to isolate a ubiquitin protease by standard techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate the purification of the natural ubiquitin protease from cells and recombinantly produced ubiquitin protease expressed in host cells.

The antibodies are useful to detect the presence of ubiquitin protease in cells or tissues to determine the pattern of expression of the ubiquitin protease among various tissues in an organism and over the course of normal development.

The antibodies can be used to detect ubiquitin protease in situ, in vitro, or in a cell lysate or supernatant in order to evaluate the abundance and pattern of expression.

The antibodies can be used to assess abnormal tissue distribution or abnormal expression during development.

Antibody detection of circulating fragments of the full-length ubiquitin protease can be used to identify ubiquitin protease turnover.

Further, the antibodies can be used to assess ubiquitin protease expression in disease states such as in active stages of the disease or in an individual with a predisposition toward disease related to ubiquitin or ubiquitin protease function. When a disorder is caused by an inappropriate tissue distribution, developmental expression, or level of expression of the ubiquitin protease protein, the antibody can be prepared against the normal ubiquitin protease protein. If a disorder is characterized by a specific mutation in the ubiquitin protease, antibodies specific for this mutant protein can be used to assay for the presence of the specific mutant ubiquitin protease. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular ubiquitin protease peptide regions.

The antibodies can also be used to assess normal and aberrant subcellular localization of the ubiquitin protease in cells in the various tissues in an organism. Antibodies can be developed against the whole ubiquitin protease or portions of the ubiquitin protease.

The diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment is ultimately aimed at correcting ubiquitin protease expression level or the presence of aberrant ubiquitin proteases and aberrant tissue distribution or developmental expression, antibodies directed against the ubiquitin protease or relevant fragments can be used to monitor therapeutic efficacy.

Antibodies accordingly can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen.

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared against polymorphic ubiquitin protease can be used to identify individuals that require modified treatment modalities.

The antibodies are also useful as diagnostic tools as an immunological marker for aberrant ubiquitin protease analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and other physical assays known to those in the art.

The antibodies are also useful for tissue typing. Thus, where a specific ubiquitin protease has been correlated with expression in a specific tissue, antibodies that are specific for this ubiquitin protease can be used to identify a tissue type.

The antibodies are also useful in forensic identification. Accordingly, where an individual has been correlated with a specific genetic polymorphism resulting in a specific polymorphic protein, an antibody specific for the polymorphic protein can be used as an aid in identification.

The antibodies are also useful for inhibiting ubiquitin protease function, for example, blocking ubiquitin or polyubiquitin binding, or binding to ubiquitinated substrate or substrate remnants.

These uses can also be applied in a therapeutic context in which treatment involves inhibiting ubiquitin protease function. An antibody can be used, for example, to block ubiquitin binding. Antibodies can be prepared against specific fragments containing sites required for function or against intact ubiquitin protease associated with a cell.

Completely human antibodies are particularly desirable for therapeutic treatment of human patients. For an overview of this technology for producing human antibodies, see Lonberg et al. (1995) Int. Rev. Immunol. 13:65-93. For a detailed discussion of this technology for producing human antibodies and human monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Pat. No. 5,625,126; U.S. Pat. No. 5,633,425; U.S. Pat. No. 5,569,825; U.S. Pat. No. 5,661,016; and U.S. Pat. No. 5,545,806.

The invention also encompasses kits for using antibodies to detect the presence of a ubiquitin protease protein in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and a compound or agent for detecting ubiquitin protease in a biological sample; means for determining the amount of ubiquitin protease in the sample; and means for comparing the amount of ubiquitin protease in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect ubiquitin protease.

Polynucleotides

The nucleotide sequence in SEQ ID NO:16 was obtained by sequencing the deposited human cDNA. Accordingly, the sequence of the deposited clone is controlling as to any discrepancies between the two and any reference to the sequence of SEQ ID NO:16 includes reference to the sequence of the deposited cDNA.

The specifically disclosed cDNA comprises the coding region and 5′ and 3′ untranslated sequences in SEQ ID NO:16.

The invention provides isolated polynucleotides encoding the novel ubiquitin protease. The term “ubiquitin protease polynucleotide” or “ubiquitin protease nucleic acid” refers to the sequence shown in SEQ ID NO:16 or in the deposited cDNA. The term “ubiquitin protease polynucleotide” or “ubiquitin protease nucleic acid” further includes variants and fragments of the ubiquitin protease polynucleotide.

An “isolated” ubiquitin protease nucleic acid is one that is separated from other nucleic acid present in the natural source of the ubiquitin protease nucleic acid. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the ubiquitin protease nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for example up to about 5 KB. The important point is that the ubiquitin protease nucleic acid is isolated from flanking sequences such that it can be subjected to the specific manipulations described herein, such as recombinant expression, preparation of probes and primers, and other uses specific to the ubiquitin protease nucleic acid sequences.

Moreover, an “isolated” nucleic acid molecule, such as a cDNA or RNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.

In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present.

For example, recombinant DNA molecules contained in a vector are considered isolated. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

The ubiquitin protease polynucleotides can encode the mature protein plus additional amino- or carboxy-terminal amino acids, or amino acids interior to the mature polypeptide (when the mature form has more than one polypeptide chain, for instance). Such sequences may play a role in processing of a protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein half-life or facilitate manipulation of a protein for assay or production, among other things. As generally is the case in situ, the additional amino acids may be processed away from the mature protein by cellular enzymes.

The ubiquitin protease polynucleotides include, but are not limited to, the sequence encoding the mature polypeptide alone, the sequence encoding the mature polypeptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or pro-protein sequence), the sequence encoding the mature polypeptide, with or without the additional coding sequences, plus additional non-coding sequences, for example introns and non-coding 5′ and 3′ sequences such as transcribed but non-translated sequences that play a role in transcription, mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability of mRNA. In addition, the polynucleotide may be fused to a marker sequence encoding, for example, a peptide that facilitates purification.

Ubiquitin protease polynucleotides can be in the form of RNA, such as mRNA, or in the form of DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the non-coding strand (anti-sense strand).

Ubiquitin protease nucleic acid can comprise the nucleotide sequence shown in SEQ ID NO:16, corresponding to human cDNA.

In one embodiment, the ubiquitin protease nucleic acid comprises only the coding region.

The invention further provides variant ubiquitin protease polynucleotides, and fragments thereof, that differ from the nucleotide sequence shown in SEQ ID NO:16 due to degeneracy of the genetic code and thus encode the same protein as that encoded by the nucleotide sequence shown in SEQ ID NO:16.
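
As an illustration of degeneracy, the following minimal sketch translates two different nucleotide sequences with the standard genetic code and shows that they encode the same peptide. The function name and both short sequences are hypothetical and are not taken from SEQ ID NO:16.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA coding sequence until a stop codon or the end."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":  # stop codon
            break
        protein.append(amino_acid)
    return "".join(protein)


# Hypothetical sequences: same peptide, different (synonymous) codons.
reference = "ATGCTGAGCAAA"
degenerate_variant = "ATGTTATCTAAG"

print(translate(reference), translate(degenerate_variant))
```

The two sequences differ at three codons yet translate identically, which is the sense in which a degenerate variant encodes the same protein.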

The invention also provides ubiquitin protease nucleic acid molecules encoding the variant polypeptides described herein. Such polynucleotides may be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or may be constructed by recombinant DNA methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. Accordingly, as discussed above, the variants can contain nucleotide substitutions, deletions, inversions and insertions.

Typically, variants have a substantial identity with a nucleic acid molecule of SEQ ID NO:16 and the complements thereof. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions.

Orthologs, homologs, and allelic variants can be identified using methods well known in the art. These variants comprise a nucleotide sequence encoding a ubiquitin protease that is at least about 60-65%, 65-70%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more homologous to the nucleotide sequence shown in SEQ ID NO:16. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions to the nucleotide sequence shown in SEQ ID NO:16 or a fragment of the sequence. It is understood that stringent hybridization does not indicate substantial homology where it is due to general homology, such as poly A sequences, or sequences common to all or most proteins or all deubiquitinating enzymes. Moreover, it is understood that variants do not include any of the nucleic acid sequences that may have been disclosed prior to the invention.
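
The percentage thresholds above can be made concrete with a small calculation. The sketch below computes percent identity over two sequences that are assumed to be already aligned (gaps written as "-"); counting gap columns as mismatches is only one of several conventions, and the function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Assumption: columns where only one sequence has a gap count as
    mismatches; columns where both have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


# Hypothetical 10-nucleotide alignment with a single mismatch.
identity = percent_identity("ATGCTGAGCA", "ATGCTAAGCA")
print(identity)
```

Under this convention a variant with one mismatch in ten aligned positions falls in the "at least about 90-95%" tier.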

As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences encoding a polypeptide at least about 60-65% homologous to each other typically remain hybridized to each other. The conditions can be such that sequences at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 90%, at least about 95% or more identical to each other remain hybridized to one another. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, incorporated by reference. One example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. In another non-limiting example, nucleic acid molecules are allowed to hybridize in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more low stringency washes in 0.2×SSC/0.1% SDS at room temperature, or by one or more moderate stringency washes in 0.2×SSC/0.1% SDS at 42° C., or washed in 0.2×SSC/0.1% SDS at 65° C. for high stringency. In one embodiment, an isolated nucleic acid molecule that hybridizes under stringent conditions to the sequence of SEQ ID NO:15 corresponds to a naturally-occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).
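
To relate the SSC strengths named above to salt concentrations, the following sketch converts an SSC strength to molar concentrations and tabulates the three wash tiers from the non-limiting example. It assumes the usual laboratory definition of 1× SSC as 0.15 M sodium chloride and 0.015 M sodium citrate, which is not stated in this specification; all names are hypothetical.

```python
# Assumption (standard laboratory definition, not from this text):
# 1x SSC = 0.15 M NaCl + 0.015 M sodium citrate.
NACL_MOLAR_PER_X = 0.15
CITRATE_MOLAR_PER_X = 0.015


def ssc_molarity(strength):
    """Return NaCl and citrate molarity for a given SSC strength (e.g. 6 or 0.2)."""
    return {
        "NaCl_M": round(strength * NACL_MOLAR_PER_X, 4),
        "citrate_M": round(strength * CITRATE_MOLAR_PER_X, 4),
    }


# Wash tiers as listed in the non-limiting example: (SSC strength, % SDS, temperature).
WASH_TIERS = {
    "low": (0.2, 0.1, "room temperature"),
    "moderate": (0.2, 0.1, "42 C"),
    "high": (0.2, 0.1, "65 C"),
}

print("hybridization:", ssc_molarity(6))
for tier, (ssc, sds, temperature) in WASH_TIERS.items():
    print(tier, ssc_molarity(ssc), f"{sds}% SDS", temperature)
```

In this example the three wash tiers share the same salt and detergent concentrations and differ only in temperature.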

As understood by those of ordinary skill, the exact conditions can be determined empirically and depend on ionic strength, temperature and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS. Other factors considered in determining the desired hybridization conditions include the length of the nucleic acid sequences, base composition, percent mismatch between the hybridizing sequences and the frequency of occurrence of subsets of the sequences within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules.
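
How these parameters trade off against one another can be sketched with a commonly quoted empirical approximation for the melting temperature of long DNA:DNA hybrids. The formula and its coefficients are an assumption brought in for illustration (they vary between sources and do not come from this specification), and the function name and test sequence are hypothetical.

```python
import math


def estimate_tm(sequence, sodium_molar, formamide_percent=0.0):
    """Approximate melting temperature (deg C) of a DNA:DNA hybrid.

    Assumed empirical form (not from this specification):
    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    sequence = sequence.upper()
    length = len(sequence)
    if length == 0 or sodium_molar <= 0:
        raise ValueError("need a non-empty sequence and positive [Na+]")
    gc_percent = 100.0 * (sequence.count("G") + sequence.count("C")) / length
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


# Hypothetical 200-nucleotide probe with 50% GC content.
probe = "ATGC" * 50
tm_high_salt = estimate_tm(probe, sodium_molar=0.9)
tm_low_salt = estimate_tm(probe, sodium_molar=0.03)
tm_formamide = estimate_tm(probe, sodium_molar=0.9, formamide_percent=50)
print(round(tm_high_salt, 1), round(tm_low_salt, 1), round(tm_formamide, 1))
```

Lowering the salt concentration or adding formamide lowers the estimated melting temperature, so a fixed wash temperature becomes more stringent; this is the sense in which equivalent conditions can be reached by varying different parameters.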

The present invention also provides isolated nucleic acids that contain a single or double stranded fragment or portion that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO:16 or the complement of SEQ ID NO:16. In one embodiment, the nucleic acid consists of a portion of the nucleotide sequence of SEQ ID NO:16 or the complement of SEQ ID NO:16.

It is understood that isolated fragments include any contiguous sequence not disclosed prior to the invention as well as sequences that are substantially the same and which are not disclosed. Accordingly, if a fragment is disclosed prior to the present invention, that fragment is not intended to be encompassed by the invention. When a sequence is not disclosed prior to the present invention, an isolated nucleic acid fragment is at least about 15, preferably at least about 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100, 200, 500 or more nucleotides in length.
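
The exclusion of previously disclosed fragments can be pictured as a simple filter over the contiguous fragments of a sequence. The sketch below is illustrative only: the sequences, the fragment length and the stand-in list of earlier sequences are hypothetical, and exact substring matching is a simplification that ignores sequences that are merely "substantially the same".

```python
def contiguous_fragments(sequence, length):
    """All contiguous fragments of the given length, in order of position."""
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def undisclosed_fragments(sequence, length, previously_disclosed):
    """Fragments that are not contained in any previously disclosed sequence."""
    return [
        fragment
        for fragment in contiguous_fragments(sequence, length)
        if not any(fragment in known for known in previously_disclosed)
    ]


# Hypothetical 15-nucleotide sequence and a hypothetical earlier disclosure.
sequence = "ATGCTGAGCAAAGGT"
earlier = ["CTGAGCAAA"]

all_fragments = contiguous_fragments(sequence, 5)
new_fragments = undisclosed_fragments(sequence, 5, earlier)
print(len(all_fragments), len(new_fragments))
```

Of the eleven 5-nucleotide fragments in this toy sequence, the five that lie wholly within the earlier sequence are filtered out.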

For example, nucleotide sequences 1 to about 269, about 761 to about 817, about 994 to about 1554, and about 1735 to about 2314 are not disclosed prior to the invention. The nucleotide sequence from about 269 to 761 encompasses fragments greater than 14, 18, 20, 23 or 25 nucleotides; the nucleotide sequence from about 817 to about 994 encompasses fragments greater than 6, 10, 15, 20, or 25 nucleotides; the nucleotide sequences from about 1154 to 1735 encompasses fragments greater than 13, 18, 20, 23 or 25 nucleotides; and the nucleotide sequence from about 2314 to about 2520 encompasses fragments greater than 33, 40, 45, or 50 nucleotides. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic proteins or polypeptides described herein are useful.

Furthermore, the invention provides polynucleotides that comprise a fragment of the full-length ubiquitin protease polynucleotides. The fragment can be single or double-stranded and can comprise DNA or RNA. The fragment can be derived from either the coding or the non-coding sequence.

In another embodiment an isolated ubiquitin protease nucleic acid encodes the entire coding region. Other fragments include nucleotide sequences encoding the amino acid fragments described herein.

Thus, ubiquitin protease nucleic acid fragments further include sequences corresponding to the domains described herein, subregions also described, and specific functional sites. Ubiquitin protease nucleic acid fragments also include combinations of the domains, segments, and other functional sites described above. A person of ordinary skill in the art would be aware of the many permutations that are possible.

Where the locations of the domains or sites have been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains can vary depending on the criteria used to define the domains.

However, it is understood that a ubiquitin protease fragment includes any nucleic acid sequence that does not include the entire gene.

The invention also provides ubiquitin protease nucleic acid fragments that encode epitope bearing regions of the ubiquitin protease proteins described herein.

Nucleic acid fragments, according to the present invention, are not to be construed as encompassing those fragments that may have been disclosed prior to the invention.

Polynucleotide Uses

The nucleotide sequences of the present invention can be used as a “query sequence” to perform a search against public databases, for example, to identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.

The nucleic acid fragments of the invention provide probes or primers in assays such as those described below. “Probes” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al. (1991) Science 254:1497-1500. Typically, a probe comprises a region of nucleotide sequence that hybridizes under highly stringent conditions to at least about 15, typically about 20-25, and more typically about 40, 50 or 75 consecutive nucleotides of the nucleic acid sequence shown in SEQ ID NO:16 and the complements thereof. More typically, the probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.
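
Base-specific hybridization of a probe to a complementary strand can be sketched as a reverse-complement lookup. The example below finds where a probe would pair perfectly with a target strand; it models exact complementarity only, not hybridization under real stringency conditions, and the sequences and function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def probe_binding_sites(probe, target):
    """0-based start positions on `target` where `probe` is perfectly complementary."""
    site = reverse_complement(probe).upper()
    target = target.upper()
    positions = []
    start = target.find(site)
    while start != -1:
        positions.append(start)
        start = target.find(site, start + 1)
    return positions


# Hypothetical target strand and a 15-nucleotide probe complementary to part of it.
target = "GGATCCATGCTGAGCAAAGGTACC"
probe = "ACCTTTGCTCAGCAT"

print(probe_binding_sites(probe, target))
```

The probe is written 5′ to 3′, so it pairs with the stretch of the target that equals its reverse complement.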

As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis using well-known methods (e.g., PCR, LCR) including, but not limited to those described herein. The appropriate length of the primer depends on the particular use, but typically ranges from about 15 to 30 nucleotides. The term “primer site” refers to the area of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the nucleic acid sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the sequence to be amplified.
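
A primer pair for a chosen region can be sketched as follows. The example adopts the common convention, assumed here, that the upstream primer copies the 5′ end of the strand as written and the downstream primer is the reverse complement of its 3′ end; it ignores melting temperature, secondary structure and specificity, and the template and names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.translate(_COMPLEMENT)[::-1]


def design_primer_pair(template, start, end, primer_length=18):
    """Return (upstream, downstream) primers flanking template[start:end].

    Both primers are written 5' to 3'. Assumes the region is long enough
    that the two primers do not overlap.
    """
    region = template[start:end]
    if len(region) < 2 * primer_length:
        raise ValueError("region too short for two non-overlapping primers")
    upstream = region[:primer_length]
    downstream = reverse_complement(region[-primer_length:])
    return upstream, downstream


# Hypothetical template: the region to amplify sits between two 6-nt flanks.
template = "GGATCC" + "ATGCTGAGCAAAGGTCTGGAAACCGATTACGCTAGCTAA" + "GGTACC"
upstream, downstream = design_primer_pair(template, 6, len(template) - 6, primer_length=10)
print(upstream, downstream)
```

The primer sites are the two ends of the selected region, so the amplified product would span exactly template[start:end].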

The ubiquitin protease polynucleotides are thus useful for probes, primers, and in biological assays.

Where the polynucleotides are used to assess ubiquitin protease properties or functions, such as in the assays described herein, all or less than all of the entire cDNA can be useful. Assays specifically directed to ubiquitin protease functions, such as assessing agonist or antagonist activity, encompass the use of known fragments. Further, diagnostic methods for assessing ubiquitin protease function can also be practiced with any fragment, including those fragments that may have been known prior to the invention. Similarly, in methods involving treatment of ubiquitin protease dysfunction, all fragments are encompassed, including those that may have been known in the art.

The ubiquitin protease polynucleotides are useful as a hybridization probe for cDNA and genomic DNA to isolate full-length cDNA and genomic clones encoding the polypeptide described in SEQ ID NO:15 and to isolate cDNA and genomic clones that correspond to variants producing the same polypeptide shown in SEQ ID NO:15 or the other variants described herein. Variants can be isolated from the same tissue and organism from which the polypeptides shown in SEQ ID NO:15 were isolated, different tissues from the same organism, or from different organisms. This method is useful for isolating genes and cDNA that are developmentally-controlled and therefore may be expressed in the same tissue or different tissues at different points in the development of an organism.

The probe can correspond to any sequence along the entire length of the gene encoding the ubiquitin protease. Accordingly, it could be derived from 5′ noncoding regions, the coding region, and 3′ noncoding regions.

The nucleic acid probe can be, for example, the full-length cDNA of SEQ ID NO:16 or a fragment thereof that is sufficient to specifically hybridize under stringent conditions to mRNA or DNA.

Fragments of the polynucleotides described herein are also useful to synthesize larger fragments or full-length polynucleotides described herein. For example, a fragment can be hybridized to any portion of an mRNA and a larger or full-length cDNA can be produced.

The fragments are also useful to synthesize antisense molecules of desired length and sequence.

Antisense nucleic acids of the invention can be designed using the nucleotide sequence of SEQ ID NO:16, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest).

Additionally, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670. PNAs can be further modified, e.g., to enhance their stability, specificity or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra, Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63, Mag et al. (1989) Nucleic Acids Res. 17:5973, and Petersen et al. (1995) Bioorganic Med. Chem. Lett. 5:1119.

The nucleic acid molecules and fragments of the invention can also include other appended groups such as peptides (e.g., for targeting host cell ubiquitin proteases in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/0918) or the blood brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm Res. 5:539-549).

The ubiquitin protease polynucleotides are also useful as primers for PCR to amplify any given region of a ubiquitin protease polynucleotide.

The ubiquitin protease polynucleotides are also useful for constructing recombinant vectors. Such vectors include expression vectors that express a portion of, or all of, the ubiquitin protease polypeptides. Vectors also include insertion vectors, used to integrate into another polynucleotide sequence, such as into the cellular genome, to alter in situ expression of ubiquitin protease genes and gene products. For example, an endogenous ubiquitin protease coding sequence can be replaced via homologous recombination with all or part of the coding region containing one or more specifically introduced mutations.

The ubiquitin protease polynucleotides are also useful for expressing antigenic portions of the ubiquitin protease proteins.

The ubiquitin protease polynucleotides are also useful as probes for determining the chromosomal positions of the ubiquitin protease polynucleotides by means of in situ hybridization methods, such as FISH (for a review of this technique, see Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York)), and PCR mapping of somatic cell hybrids. The mapping of the sequences to chromosomes is an important first step in correlating these sequences with genes associated with disease.

Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease mapped to the same chromosomal region can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland et al. ((1987) Nature 325:783-787).

Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with a specified gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations, that are visible from chromosome spreads, or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

The ubiquitin protease polynucleotide probes are also useful to determine patterns of the presence of the gene encoding the ubiquitin proteases and their variants with respect to tissue distribution, for example, whether gene duplication has occurred and whether the duplication occurs in all or only a subset of tissues. The genes can be naturally occurring or can have been introduced into a cell, tissue, or organism exogenously.

The ubiquitin protease polynucleotides are also useful for designing ribozymes corresponding to all, or a part, of the mRNA produced from genes encoding the polynucleotides described herein.

The ubiquitin protease polynucleotides are also useful for constructing host cells expressing a part, or all, of the ubiquitin protease polynucleotides and polypeptides.

The ubiquitin protease polynucleotides are also useful for constructing transgenic animals expressing all, or a part, of the ubiquitin protease polynucleotides and polypeptides.

The ubiquitin protease polynucleotides are also useful for making vectors that express part, or all, of the ubiquitin protease polypeptides.

The ubiquitin protease polynucleotides are also useful as hybridization probes for determining the level of ubiquitin protease nucleic acid expression. Accordingly, the probes can be used to detect the presence of, or to determine levels of, ubiquitin protease nucleic acid in cells, tissues, and in organisms. The nucleic acid whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the polypeptides described herein can be used to assess gene copy number in a given cell, tissue, or organism. This is particularly relevant in cases in which there has been an amplification of the ubiquitin protease genes.

Alternatively, the probe can be used in an in situ hybridization context to assess the position of extra copies of the ubiquitin protease genes, as on extrachromosomal elements or as integrated into chromosomes in which the ubiquitin protease gene is not normally found, for example as a homogeneously staining region.

These uses are relevant for diagnosis of disorders involving an increase or decrease in ubiquitin protease expression relative to normal, such as a proliferative disorder, a differentiative or developmental disorder, or a hematopoietic disorder.

Tissues and/or cells in which the ubiquitin protease is expressed include, but are not limited to those shown in FIGS. 37 and 38. Tissues in which the gene is highly expressed include fetal kidney, testes, fetal liver, ovary, and fetal heart. Expression is also seen in the kidney, thyroid, undifferentiated osteoblasts and skeletal muscle. The ubiquitin protease is also expressed in normal liver and in normal and malignant breast, lung, and colon tissue and in liver metastases derived from malignant colonic tissues. The ubiquitin proteases are thus specifically involved in breast, lung, and liver cancer.

As such, the gene is particularly relevant for the treatment of disorders involving these tissues. Disorders in which the ubiquitin protease expression is relevant are disclosed herein above.

Furthermore, the ubiquitin protease is useful to treat viral infections and disorders resulting from viral infections. Such disorders are discussed above.

Thus, the present invention provides a method for identifying a disease or disorder associated with aberrant expression or activity of ubiquitin protease nucleic acid, in which a test sample is obtained from a subject and nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of the nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity of the nucleic acid.

One aspect of the invention relates to diagnostic assays for determining nucleic acid expression as well as activity in the context of a biological sample (e.g., blood, serum, cells, tissue) to determine whether an individual has a disease or disorder, or is at risk of developing a disease or disorder, associated with aberrant nucleic acid expression or activity. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with expression or activity of the nucleic acid molecules.

In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express the ubiquitin protease, such as by measuring the level of a ubiquitin protease-encoding nucleic acid in a sample of cells from a subject, e.g., mRNA or genomic DNA, or determining if the ubiquitin protease gene has been mutated.

Nucleic acid expression assays are useful for drug screening to identify compounds that modulate ubiquitin protease nucleic acid expression (e.g., antisense, polypeptides, peptidomimetics, small molecules or other drugs). A cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of the mRNA in the presence of the candidate compound is compared to the level of expression of the mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example to treat a disorder characterized by aberrant nucleic acid expression. The modulator can bind to the nucleic acid or indirectly modulate expression, such as by interacting with other cellular components that affect nucleic acid expression.

Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject) in patients or in transgenic animals.

The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with nucleic acid expression of the ubiquitin protease gene. The method typically includes assaying the ability of the compound to modulate the expression of the ubiquitin protease nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by undesired ubiquitin protease nucleic acid expression.

The assays can be performed in cell-based and cell-free systems. Cell-based assays include cells naturally expressing the ubiquitin protease nucleic acid or recombinant cells genetically engineered to express specific nucleic acid sequences.

Alternatively, candidate compounds can be assayed in vivo in patients or in transgenic animals.

The assay for ubiquitin protease nucleic acid expression can involve direct assay of nucleic acid levels, such as mRNA levels, or assay of collateral compounds involved in the pathway (such as free ubiquitin pool or protein turnover). Further, the expression of genes that are up- or down-regulated in response to the ubiquitin protease activity can also be assayed. In this embodiment the regulatory regions of these genes can be operably linked to a reporter gene such as luciferase.

Thus, modulators of ubiquitin protease gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of ubiquitin protease mRNA in the presence of the candidate compound is compared to the level of expression of ubiquitin protease mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
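
The comparison described above can be sketched with an exact two-sided permutation test on replicate expression measurements. The choice of test, the 0.05 significance threshold, the function names and the measurements are all assumptions made for illustration; the specification does not prescribe a particular statistical method.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation p-value for a difference in means."""
    observed = abs(mean(treated) - mean(untreated))
    pooled = list(treated) + list(untreated)
    total = 0
    extreme = 0
    for chosen in combinations(range(len(pooled)), len(treated)):
        chosen = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from mRNA levels with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical relative mRNA levels from four replicate cultures each.
with_compound = [2.1, 2.4, 2.2, 2.5]
without_compound = [1.0, 1.1, 0.9, 1.2]
print(classify_compound(with_compound, without_compound))
```

With four replicates per group there are 70 possible relabelings, so the smallest attainable two-sided p-value is 2/70; fewer replicates could not reach the 0.05 threshold with this test.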

Accordingly, the invention provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate ubiquitin protease nucleic acid expression. Modulation includes up-regulation (i.e., activation or agonization), down-regulation (suppression or antagonization), and effects on nucleic acid activity (e.g., when nucleic acid is mutated or improperly modified). Treatment includes disorders characterized by aberrant expression or activity of the nucleic acid. In addition, disorders that are influenced by the ubiquitin protease may also be treated. Examples of such disorders are disclosed herein.

Alternatively, a modulator for ubiquitin protease nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the ubiquitin protease nucleic acid expression.

The ubiquitin protease polynucleotides are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the ubiquitin protease gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

Monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of a specified mRNA or genomic DNA of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the mRNA or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the mRNA or genomic DNA in the pre-administration sample with the mRNA or genomic DNA in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.

The ubiquitin protease polynucleotides are also useful in diagnostic assays for qualitative changes in ubiquitin protease nucleic acid, and particularly in qualitative changes that lead to pathology. The polynucleotides can be used to detect mutations in ubiquitin protease genes and gene expression products such as mRNA. The polynucleotides can be used as hybridization probes to detect naturally-occurring genetic mutations in the ubiquitin protease gene and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene; chromosomal rearrangement, such as inversion or transposition; modification of genomic DNA, such as aberrant methylation patterns; and changes in gene copy number, such as amplification. Detection of a mutated form of the ubiquitin protease gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of a ubiquitin protease.

Mutations in the ubiquitin protease gene can be detected at the nucleic acid level by a variety of techniques. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way.

In certain embodiments, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) PNAS 91:360-364), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.
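The size comparison of amplification products can be illustrated with a small in silico sketch. It assumes exact primer matches on a single strand of a linear template and ignores all reaction conditions; the function names and the toy sequences in the usage note are hypothetical and are not taken from the sequences of the invention.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the product bounded by the two primers, or None if no product."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand its binding site reads as the primer's reverse complement.
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return end + len(site) - start


def classify_by_size(sample, control, forward, reverse):
    """Compare the sample amplicon size against the control amplicon size."""
    control_len = amplicon_length(control, forward, reverse)
    if control_len is None:
        raise ValueError("primers do not amplify the control template")
    sample_len = amplicon_length(sample, forward, reverse)
    if sample_len is None:
        return "no product"
    if sample_len < control_len:
        return "deletion"
    if sample_len > control_len:
        return "insertion"
    return "same size"
```

For example, a sample template lacking four bases between the primer sites would be reported as "deletion". A point mutation leaves the product size unchanged, which is why the hybridization-based methods are described separately.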

It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well-known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.

Alternatively, mutations in a ubiquitin protease gene can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
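As a rough illustration of how a mutation alters a digestion pattern, the sketch below computes fragment lengths for a linear sequence cut at every occurrence of a recognition site. It assumes a single enzyme with an unambiguous recognition sequence and considers one strand only. The EcoRI site (GAATTC, cut after the first G) is used as a familiar example; the function name and the toy sequences in the usage note are hypothetical.

```python
def digest(seq, site, cut_offset):
    """Sorted fragment lengths after cutting a linear sequence at each site.

    cut_offset is the number of bases into the recognition site at which
    the enzyme cuts (1 for EcoRI, which cuts G^AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(b - a for a, b in zip(bounds, bounds[1:]))
```

A wild-type sequence such as `"AAAAGAATTCAAAAAA"` yields two fragments with `digest(seq, "GAATTC", 1)`, whereas a single substitution inside the site leaves one uncut fragment. On a gel this appears as a changed band pattern.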

Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.

Perfectly matched sequences can be distinguished from mismatched sequences by nuclease cleavage digestion assays or by differences in melting temperature.

Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method.

Furthermore, sequence differences between a mutant ubiquitin protease gene and a wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-159).

Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) PNAS 85:4397; Saleeba et al. (1992) Meth. Enzymol. 217:286-295), methods in which the electrophoretic mobility of mutant and wild-type nucleic acid is compared (Orita et al. (1989) PNAS 86:2766; Cotton et al. (1993) Mutat. Res. 285:125-144; and Hayashi et al. (1992) Genet. Anal. Tech. Appl. 9:73-79), and methods in which movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al. (1985) Nature 313:495). The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In one embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet. 7:5). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

In other embodiments, genetic mutations can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al. (1996) Human Mutation 7:244-255; Kozal et al. (1996) Nature Medicine 2:753-759). For example, genetic mutations can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
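The idea of scanning with sequential overlapping probes can be sketched as follows. This is a deliberately simplified model, not the array method of Cronin et al.: hybridization is treated as an exact string match at the probe's own position, only single-base substitutions are considered, and all function names are hypothetical.

```python
def tile_probes(reference, probe_length, step=1):
    """Sequential overlapping probes as (start, probe_sequence) pairs."""
    last_start = len(reference) - probe_length
    return [(i, reference[i:i + probe_length])
            for i in range(0, last_start + 1, step)]


def mismatched_probes(sample, probes):
    """Start positions of probes that do not perfectly match the sample."""
    return [start for start, probe in probes
            if sample[start:start + len(probe)] != probe]


def candidate_positions(failed_starts, probe_length):
    """Positions covered by every failed probe, where a single change must lie."""
    if not failed_starts:
        return []
    return list(range(max(failed_starts), min(failed_starts) + probe_length))
```

Because every probe overlapping a substituted base fails, the run of failed probes brackets the change, and the position common to all of them localizes it. A second, smaller probe set would then be needed to determine which base is present.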

The ubiquitin protease polynucleotides are also useful for testing an individual for a genotype that, while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the polynucleotides can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). In the present case, for example, a mutation in the ubiquitin protease gene that results in altered affinity for ubiquitin could result in an excessive or decreased drug effect with standard concentrations of ubiquitin or analog. Accordingly, the ubiquitin protease polynucleotides described herein can be used to assess the mutation content of the gene in an individual in order to select an appropriate compound or dosage regimen for treatment.

Thus, polynucleotides displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

The methods can involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting mRNA, or genomic DNA, such that the presence of mRNA or genomic DNA is detected in the biological sample, and comparing the presence of mRNA or genomic DNA in the control sample with the presence of mRNA or genomic DNA in the test sample.

The ubiquitin protease polynucleotides are also useful for chromosome identification when the sequence is identified with an individual chromosome and to a particular location on the chromosome. First, the DNA sequence is matched to the chromosome by in situ or other chromosome-specific hybridization. Sequences can also be correlated to specific chromosomes by preparing PCR primers that can be used for PCR screening of somatic cell hybrids containing individual chromosomes from the desired species. Only hybrids containing the chromosome containing the gene homologous to the primer will yield an amplified fragment. Sublocalization can be achieved using chromosomal fragments. Other strategies include prescreening with labeled flow-sorted chromosomes and preselection by hybridization to chromosome-specific libraries. Further mapping strategies include fluorescence in situ hybridization, which allows hybridization with probes shorter than those traditionally used. Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on the chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

The ubiquitin protease polynucleotides can also be used to identify individuals based on small biological samples. This can be done, for example, using restriction fragment-length polymorphism (RFLP) to identify an individual. Thus, the polynucleotides described herein are useful as DNA markers for RFLP (see U.S. Pat. No. 5,272,057).

Furthermore, the ubiquitin protease sequence can be used to provide an alternative technique, which determines the actual DNA sequence of selected fragments in the genome of an individual. Thus, the ubiquitin protease sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify DNA from an individual for subsequent sequencing.

Panels of corresponding DNA sequences from individuals prepared in this manner can provide unique individual identifications, as each individual will have a unique set of such DNA sequences. It is estimated that allelic variation in humans occurs with a frequency of about once per 500 bases. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. The ubiquitin protease sequences can be used to obtain such identification sequences from individuals and from tissue. The sequences represent unique fragments of the human genome. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes.

If a panel of reagents from the sequences is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

The ubiquitin protease polynucleotides can also be used in forensic identification procedures. PCR technology can be used to amplify DNA sequences taken from very small biological samples, such as a single hair follicle or body fluids (e.g., blood, saliva, or semen). The amplified sequence can then be compared to a standard, allowing identification of the origin of the sample.

The ubiquitin protease polynucleotides can thus be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As described above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to the noncoding region are particularly useful since greater polymorphism occurs in the noncoding regions, making it easier to differentiate individuals using this technique.

The ubiquitin protease polynucleotides can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This is useful in cases in which a forensic pathologist is presented with a tissue of unknown origin. Panels of ubiquitin protease probes can be used to identify tissue by species and/or by organ type.

In a similar fashion, these primers and probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Alternatively, the ubiquitin protease polynucleotides can be used directly to block transcription or translation of ubiquitin protease gene sequences by means of antisense or ribozyme constructs. Thus, in a disorder characterized by abnormally high or undesirable ubiquitin protease gene expression, nucleic acids can be directly used for treatment.

The ubiquitin protease polynucleotides are thus useful as antisense constructs to control ubiquitin protease gene expression in cells, tissues, and organisms. A DNA antisense polynucleotide is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of ubiquitin protease protein. An antisense RNA or DNA polynucleotide would hybridize to the mRNA and thus block translation of mRNA into ubiquitin protease protein.

Examples of antisense molecules useful to inhibit nucleic acid expression include antisense molecules complementary to a fragment of the 5′ untranslated region of SEQ ID NO:16 which also includes the start codon, and antisense molecules which are complementary to a fragment of the 3′ untranslated region of SEQ ID NO:16.
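A minimal sketch of deriving such an antisense sequence is given below. It assumes the sense-strand cDNA is available as an upper-case DNA string and that the position of the start codon is known. The function names, the default flank lengths, and the toy sequence in the usage note are hypothetical; SEQ ID NO:16 itself is not reproduced here.

```python
def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def antisense_over_start(cdna, start_index, upstream=9, downstream=9):
    """Antisense DNA oligo spanning the start codon and part of the 5' UTR.

    start_index is the 0-based position of the A of the ATG start codon.
    """
    if cdna[start_index:start_index + 3] != "ATG":
        raise ValueError("no ATG start codon at the given index")
    lo = max(0, start_index - upstream)
    hi = min(len(cdna), start_index + 3 + downstream)
    # The antisense oligo is the reverse complement of the targeted window.
    return reverse_complement(cdna[lo:hi])
```

For a toy cDNA `"GGCACGAGGCCATGGCTAGCAAGTGA"` with the start codon at index 11, `antisense_over_start(cdna, 11, upstream=5, downstream=4)` targets the window `AGGCCATGGCTA`. An oligo against the 3′ untranslated region could be obtained the same way by taking the reverse complement of a window downstream of the stop codon.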

Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of ubiquitin protease nucleic acid. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired ubiquitin protease nucleic acid expression. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the ubiquitin protease protein.

The ubiquitin protease polynucleotides also provide vectors for gene therapy in patients containing cells that are aberrant in ubiquitin protease gene expression. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired ubiquitin protease protein to treat the individual.

The invention also encompasses kits for detecting the presence of a ubiquitin protease nucleic acid in a biological sample. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting ubiquitin protease nucleic acid in a biological sample; means for determining the amount of ubiquitin protease nucleic acid in the sample; and means for comparing the amount of ubiquitin protease nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect ubiquitin protease mRNA or DNA.

Computer Readable Means

The nucleotide or amino acid sequences of the invention are also provided in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a nucleotide or amino acid sequence of the present invention. Such a manufacture provides the nucleotide or amino acid sequences, or a subset thereof (e.g., a subset of open reading frames (ORFs)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form.

In one application of this embodiment, a nucleotide or amino acid sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. The skilled artisan will readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. The skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide or amino acid sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
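As one illustration of recording sequence information in an ASCII file, the sketch below writes and reads back a single record in a FASTA-style layout (a header line beginning with ">" followed by wrapped sequence lines). The choice of this layout is an assumption made for the example; the text above does not name a specific file format, and the function names are hypothetical.

```python
from pathlib import Path


def write_fasta(path, name, seq, width=60):
    """Record one sequence as an ASCII text file with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    Path(path).write_text(f">{name}\n" + "\n".join(lines) + "\n", encoding="ascii")


def read_fasta(path):
    """Read back a single-record file written by write_fasta."""
    lines = Path(path).read_text(encoding="ascii").splitlines()
    if not lines or not lines[0].startswith(">"):
        raise ValueError("missing header line")
    return lines[0][1:], "".join(lines[1:])
```

A plain text layout of this kind is readable by most sequence analysis programs, whereas a database table would suit indexed access to many records.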

By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif.

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
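The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch assumes, as a simplification, that database residues are independent and uniformly distributed over the alphabet (4 nucleotides or 20 amino acids); real sequences deviate from this, so the figures are order-of-magnitude estimates only. The function name is hypothetical.

```python
def expected_chance_hits(target_length, database_length, alphabet_size=4):
    """Expected number of exact chance matches of a target in a random database."""
    # Number of windows of the target's length that fit in the database.
    windows = max(database_length - target_length + 1, 0)
    return windows / alphabet_size ** target_length
```

Under these assumptions a six-nucleotide target is expected about once in every 4,096 positions, while a 30-nucleotide target is expected far less than once even in a database of three billion bases.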

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

For example, software which implements the BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al. (1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system can be used to identify open reading frames (ORFs) of the sequences of the invention which contain homology to ORFs or proteins from other libraries. Such ORFs are protein encoding fragments and are useful in producing commercially important proteins such as enzymes used in various reactions and in the production of commercially useful metabolites.
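Before any homology search, candidate ORFs must first be located in a sequence. The sketch below is a simple scan for ATG-to-stop reading frames on the forward strand only; it is not an implementation of BLAST or BLAZE and performs no homology comparison. The function name and the minimum-length default are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Forward-strand ORFs as (start, end) half-open index pairs.

    An ORF runs from the first ATG in a frame to the next in-frame stop
    codon (inclusive). min_codons counts codons before the stop codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return sorted(orfs)
```

Each reported slice `seq[start:end]` could then be translated and submitted to a similarity search. A fuller treatment would also scan the reverse complement strand.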

Vectors/Host Cells

The invention also provides vectors containing the ubiquitin protease polynucleotides. The term “vector” refers to a vehicle, preferably a nucleic acid molecule, that can transport the ubiquitin protease polynucleotides. When the vector is a nucleic acid molecule, the ubiquitin protease polynucleotides are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the ubiquitin protease polynucleotides. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the ubiquitin protease polynucleotides when the host cell replicates.

The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the ubiquitin protease polynucleotides. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the ubiquitin protease polynucleotides such that transcription of the polynucleotides is allowed in a host cell. The polynucleotides can be introduced into the host cell with a separate polynucleotide capable of affecting transcription. Thus, the second polynucleotide may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the ubiquitin protease polynucleotides from the vector. Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself.

It is understood, however, that in some embodiments, transcription and/or translation of the ubiquitin protease polynucleotides can occur in a cell-free system.

The regulatory sequences to which the polynucleotides described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

A variety of expression vectors can be used to express a ubiquitin protease polynucleotide. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g., cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The regulatory sequence may provide constitutive expression in one or more host cells (i.e., tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

The ubiquitin protease polynucleotides can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.

The vector containing the appropriate polynucleotide can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

As described herein, it may be desirable to express the polypeptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the ubiquitin protease polypeptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting, for example, as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired polypeptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al. (1990) Gene Expression Technology: Methods in Enzymology 185:60-89).

Recombinant protein expression can be maximized in a host bacterium by providing a genetic background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S. (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Alternatively, the sequence of the polynucleotide of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118).
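Altering a sequence toward preferential codon usage can be sketched as a lookup from each amino acid to a single preferred codon. The table below is a small illustrative subset chosen for the example and is an assumption, not data from Wada et al.; a real design would use a measured codon-usage table for the chosen host and would typically balance codons rather than always selecting one. The names used are hypothetical.

```python
# Illustrative preferred codons only (assumed for this sketch); consult a
# measured codon-usage table for the actual host strain before relying on them.
PREFERRED_CODON = {
    "M": "ATG", "K": "AAA", "L": "CTG", "E": "GAA",
    "A": "GCG", "G": "GGC", "*": "TAA",
}


def recode(protein, table=PREFERRED_CODON):
    """Back-translate a protein string using one preferred codon per residue."""
    missing = sorted(set(protein) - set(table))
    if missing:
        raise KeyError(f"no preferred codon listed for: {', '.join(missing)}")
    return "".join(table[aa] for aa in protein)
```

The encoded protein is unchanged by this substitution, because only synonymous codons are exchanged.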

The ubiquitin protease polynucleotides can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include pYepSec1 (Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan et al. (1982) Cell 30:933-943), pJRY88 (Schultz et al. (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

The ubiquitin protease polynucleotides can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell. Biol. 3:2156-2165) and the pVL series (Luckow et al. (1989) Virology 170:31-39).

In certain embodiments of the invention, the polynucleotides described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195).

The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the ubiquitin protease polynucleotides. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation, or expression of the polynucleotides described herein. These are found, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the polynucleotide sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).

The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, lipofection, and other techniques such as those found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

Host cells can contain more than one vector. Thus, different nucleotide sequences can be introduced on different vectors of the same cell. Similarly, the ubiquitin protease polynucleotides can be introduced either alone or with other polynucleotides that are not related to the ubiquitin protease polynucleotides such as those providing trans-acting factors for expression vectors. When more than one vector is introduced into a cell, the vectors can be introduced independently, co-introduced or joined to the ubiquitin protease polynucleotide vector.

In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be replication-competent or replication-defective. In the case in which viral replication is defective, replication will occur in host cells providing functions that complement the defects.

Vectors generally include selectable markers that enable the selection of the subpopulation of cells that contain the recombinant vector constructs. The marker can be contained in the same vector that contains the polynucleotides described herein or may be on a separate vector. Markers include tetracycline- or ampicillin-resistance genes for prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that provides selection for a phenotypic trait will be effective.

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other cells under the control of the appropriate regulatory sequences, cell-free transcription and translation systems can also be used to produce these proteins using RNA derived from the DNA constructs described herein.

Where secretion of the polypeptide is desired, appropriate secretion signals are incorporated into the vector. The signal sequence can be endogenous to the ubiquitin protease polypeptides or heterologous to these polypeptides.

Where the polypeptide is not secreted into the medium, the protein can be isolated from the host cell by standard disruption procedures, including freeze thaw, sonication, mechanical disruption, use of lysing agents and the like. The polypeptide can then be recovered and purified by well-known purification methods including ammonium sulfate precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, or high performance liquid chromatography.

It is also understood that depending upon the host cell in recombinant production of the polypeptides described herein, the polypeptides can have various glycosylation patterns, depending upon the cell, or may be non-glycosylated as when produced in bacteria. In addition, the polypeptides may include an initial modified methionine in some cases as a result of a host-mediated process.

Uses of Vectors and Host Cells

It is understood that “host cells” and “recombinant host cells” refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

The host cells expressing the polypeptides described herein, and particularly recombinant host cells, have a variety of uses. First, the cells are useful for producing ubiquitin protease proteins or polypeptides that can be further purified to produce desired amounts of ubiquitin protease protein or fragments. Thus, host cells containing expression vectors are useful for polypeptide production.

Host cells are also useful for conducting cell-based assays involving the ubiquitin protease or ubiquitin protease fragments. Thus, a recombinant host cell expressing a native ubiquitin protease is useful to assay for compounds that stimulate or inhibit ubiquitin protease function. This includes disappearance of substrate (polyubiquitin, ubiquitinated substrate protein, ubiquitinated substrate remnants), appearance of end product (ubiquitin monomers, polyubiquitin hydrolyzed from substrate or substrate remnant, free substrate that has been rescued by hydrolysis of ubiquitin), general or specific protein turnover, and the various other molecular functions described herein that include, but are not limited to, substrate recognition, substrate binding, subunit association, and interaction with other cellular components. Modulation of gene expression can occur at the level of transcription or translation.

Host cells are also useful for identifying ubiquitin protease mutants in which these functions are affected. If the mutants naturally occur and give rise to a pathology, host cells containing the mutations are useful to assay compounds that have a desired effect on the mutant ubiquitin protease (for example, stimulating or inhibiting function) which may not be indicated by their effect on the native ubiquitin protease.

Recombinant host cells are also useful for expressing the chimeric polypeptides described herein to assess compounds that activate or suppress activation or alter specific function by means of a heterologous domain, segment, site, and the like, as disclosed herein.

Further, mutant ubiquitin proteases can be designed in which one or more of the various functions is engineered to be increased or decreased (e.g., binding to ubiquitin, polyubiquitin, or ubiquitinated protein substrate) and used to augment or replace ubiquitin protease proteins in an individual. Thus, host cells can provide a therapeutic benefit by replacing an aberrant ubiquitin protease or providing an aberrant ubiquitin protease that provides a therapeutic result. In one embodiment, the cells provide ubiquitin proteases that are abnormally active.

In another embodiment, the cells provide ubiquitin proteases that are abnormally inactive. These ubiquitin proteases can compete with endogenous ubiquitin proteases in the individual.

In another embodiment, cells expressing ubiquitin proteases that cannot be activated are introduced into an individual in order to compete with endogenous ubiquitin proteases for ubiquitin substrates. For example, in the case in which excessive ubiquitin substrate or analog is part of a treatment modality, it may be necessary to inactivate this molecule at a specific point in treatment. Providing cells that compete for the molecule, but which cannot be affected by ubiquitin protease activation, would be beneficial.

Homologously recombinant host cells can also be produced that allow the in situ alteration of endogenous ubiquitin protease polynucleotide sequences in a host cell genome. The host cell includes, but is not limited to, a stable cell line, cell in vivo, or cloned microorganism. This technology is more fully described in WO 93/09222, WO 91/12650, WO 91/06667, U.S. Pat. No. 5,272,071, and U.S. Pat. No. 5,641,670. Briefly, specific polynucleotide sequences corresponding to the ubiquitin protease polynucleotides or sequences proximal or distal to a ubiquitin protease gene are allowed to integrate into a host cell genome by homologous recombination where expression of the gene can be affected. In one embodiment, regulatory sequences are introduced that either increase or decrease expression of an endogenous sequence. Accordingly, a ubiquitin protease can be produced in a cell not normally producing it. Alternatively, increased expression of ubiquitin protease can be effected in a cell normally producing the protein at a specific level. Further, expression can be decreased or eliminated by introducing a specific regulatory sequence. The regulatory sequence can be heterologous to the ubiquitin protease protein sequence or can be a homologous sequence with a desired mutation that affects expression. Alternatively, the entire gene can be deleted. The regulatory sequence can be specific to the host cell or capable of functioning in more than one cell type. Still further, specific mutations can be introduced into any desired region of the gene to produce mutant ubiquitin protease proteins. Such mutations could be introduced, for example, into the specific functional regions such as the ligand-binding site.

In one embodiment, the host cell can be a fertilized oocyte or embryonic stem cell that can be used to produce a transgenic animal containing the altered ubiquitin protease gene. Alternatively, the host cell can be a stem cell or other early tissue precursor that gives rise to a specific subset of cells and can be used to produce transgenic tissues in an animal. See also Thomas et al., Cell 51:503 (1987) for a description of homologous recombination vectors. The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced gene has homologously recombined with the endogenous ubiquitin protease gene are selected (see e.g., Li, E. et al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT International Publication Nos. WO 90/11354; WO 91/01140; and WO 93/04169.

The genetically engineered host cells can be used to produce non-human transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse, in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal in one or more cell types or tissues of the transgenic animal. These animals are useful for studying the function of a ubiquitin protease protein and identifying and evaluating modulators of ubiquitin protease protein activity.

Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians.

In one embodiment, a host cell is a fertilized oocyte or an embryonic stem cell into which ubiquitin protease polynucleotide sequences have been introduced.

A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. Any of the ubiquitin protease nucleotide sequences can be introduced as a transgene into the genome of a non-human animal, such as a mouse.

Any of the regulatory or other sequences useful in expression vectors can form part of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already included. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct expression of the ubiquitin protease protein to particular cells.

Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to other transgenic animals carrying other transgenes. A transgenic animal also includes animals in which the entire animal or tissues in the animal have been produced using the homologously recombinant host cells described herein.

In another embodiment, transgenic non-human animals can be produced that contain selected systems that allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS 89:6232-6236. Another example of a recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature 385:810-813 and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle and enter G₀ phase. The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst and then transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

Transgenic animals containing recombinant cells that express the polypeptides described herein are useful to conduct the assays described herein in an in vivo context. Accordingly, the various physiological factors that are present in vivo and that could affect, for example, binding, activation, and protein turnover, may not be evident from in vitro cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to assay in vivo ubiquitin protease function, including substrate interaction, the effect of specific mutant ubiquitin proteases on ubiquitin protease function and substrate interaction, and the effect of chimeric ubiquitin proteases. It is also possible to assess the effect of null mutations, that is, mutations that substantially or completely eliminate one or more ubiquitin protease functions.

In general, methods for producing transgenic animals include introducing a nucleic acid sequence according to the present invention, the nucleic acid sequence capable of expressing the receptor protein in a transgenic animal, into a cell in culture or in vivo. When introduced in vivo, the nucleic acid is introduced into an intact organism such that one or more cell types and, accordingly, one or more tissue types, express the nucleic acid encoding the receptor protein. Alternatively, the nucleic acid can be introduced into virtually all cells in an organism by transfecting a cell in culture, such as an embryonic stem cell, as described herein for the production of transgenic animals, and this cell can be used to produce an entire transgenic organism. As described, in a further embodiment, the host cell can be a fertilized oocyte. Such cells are then allowed to develop in a female foster animal to produce the transgenic organism.

Pharmaceutical Compositions

The ubiquitin protease nucleic acid molecules, protein, modulators of the protein, and antibodies (also referred to herein as “active compounds”) can be incorporated into pharmaceutical compositions suitable for administration to a subject, e.g., a human. Such compositions typically comprise the nucleic acid molecule, protein, modulator, or antibody and a pharmaceutically acceptable carrier.

The term “administer” is used in its broadest sense and includes any method of introducing the compositions of the present invention into a subject. This includes producing polypeptides or polynucleotides in vivo as by transcription or translation, in vivo, of polynucleotides that have been exogenously introduced into a subject. Thus, polypeptides or nucleic acids produced in the subject from the exogenous compositions are encompassed in the term “administer.”

As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, such media can be used in the compositions of the invention. Supplementary active compounds can also be incorporated into the compositions. A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampules, disposable syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., a ubiquitin protease protein or anti-ubiquitin protease antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For oral administration, the agent can be contained in enteric forms to survive the stomach or further coated or mixed to be released in a particular region of the GI tract by known methods. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser, which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. “Dosage unit form” as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specification for the dosage unit forms of the invention is dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) PNAS 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight.
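The weight-based ranges above translate into absolute amounts by simple multiplication. The following minimal sketch shows that arithmetic only; the function name and the 70 kg body weight are hypothetical values chosen for illustration, and nothing here is a dosing recommendation.

```python
def dose_range_mg(weight_kg: float, low_mg_per_kg: float, high_mg_per_kg: float) -> tuple[float, float]:
    """Convert a mg/kg dosage range into an absolute range in mg for one subject."""
    if weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if not 0 <= low_mg_per_kg <= high_mg_per_kg:
        raise ValueError("range bounds must satisfy 0 <= low <= high")
    return (low_mg_per_kg * weight_kg, high_mg_per_kg * weight_kg)


# The nested ranges recited above, in mg/kg body weight.
RECITED_RANGES_MG_PER_KG = [(0.001, 30), (0.01, 25), (0.1, 20), (1, 10)]

# Hypothetical 70 kg subject, for illustration only.
EXAMPLE_WEIGHT_KG = 70
for low, high in RECITED_RANGES_MG_PER_KG:
    low_mg, high_mg = dose_range_mg(EXAMPLE_WEIGHT_KG, low, high)
    print(f"{low} to {high} mg/kg -> {low_mg:g} to {high_mg:g} mg")
```

For the narrowest of these four ranges (about 1 to 10 mg/kg), the hypothetical 70 kg subject corresponds to about 70 to 700 mg.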

The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments. In a preferred example, a subject is treated with antibody, protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays as described herein.

The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

It is understood that appropriate doses of small molecule agents depend upon a number of factors within the purview of the ordinarily skilled physician, veterinarian, or researcher. The dose(s) of the small molecule will vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be determined using the assays described herein. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.
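Because the exemplary small molecule doses above mix microgram and milligram units, it can help to normalize them before comparing ranges. The sketch below is a minimal illustration of that unit handling; the function name, the unit labels, and the 70 kg weight are assumptions made for the example and do not represent a prescribed dose.

```python
# Micrograms per unit, so that every dose can be expressed on one scale.
UG_PER_UNIT = {"ug": 1, "mg": 1_000}


def total_dose_ug(dose_per_kg: float, unit: str, weight_kg: float) -> float:
    """Return the total amount in micrograms for a per-kilogram dose and a weight."""
    if unit not in UG_PER_UNIT:
        raise ValueError(f"unsupported unit: {unit!r}")
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    return dose_per_kg * UG_PER_UNIT[unit] * weight_kg


# The three exemplary ranges recited above, as ((low, unit), (high, unit)) per kilogram.
EXEMPLARY_RANGES = [
    ((1, "ug"), (500, "mg")),
    ((100, "ug"), (5, "mg")),
    ((1, "ug"), (50, "ug")),
]

# Hypothetical 70 kg subject, for illustration only.
EXAMPLE_WEIGHT_KG = 70
for (low, low_unit), (high, high_unit) in EXEMPLARY_RANGES:
    low_ug = total_dose_ug(low, low_unit, EXAMPLE_WEIGHT_KG)
    high_ug = total_dose_ug(high, high_unit, EXAMPLE_WEIGHT_KG)
    print(f"{low} {low_unit}/kg to {high} {high_unit}/kg -> {low_ug:g} to {high_ug:g} ug")
```

Expressing every bound in micrograms makes it plain that the broadest recited range spans more than five orders of magnitude, which is why the text leaves the final choice to the practitioner.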

This invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will fully convey the invention to those skilled in the art. Many modifications and other embodiments of the invention will come to mind to one skilled in the art to which this invention pertains having the benefit of the teachings presented in the foregoing description. Although specific terms are employed, they are used as in the art unless otherwise indicated.

CHAPTER 4 18891, a Novel Human Lipase

BACKGROUND OF THE INVENTION

Lipases are indispensable for the bioconversion of lipids within an organism through the catalysis of a variety of reactions that include hydrolysis, alcoholysis, acidolysis, esterification and aminolysis. In humans, several lipases have been identified which possess lipolytic activities that regulate levels of triglycerides and cholesterol in the body. Enzymes from this superfamily include lipoprotein lipase (LPL), hepatic lipase (HL), and pancreatic lipase (PL). While all three enzymes hydrolyze lipid emulsions and have similar aqueous-lipid interfacial catalytic activities, they each possess unique properties and physiological functions. All three enzymes act preferentially on the sn-1 and sn-3 bonds of triglycerides, to release fatty acids from the glycerol backbone (Dolphin et al. (1992) Structure and Function of Apolipoproteins, Rosseneu, M. (ed) CRC Press, Inc, Boca Raton, 295-362). However, while PL completes the hydrolysis of alimentary triglycerides, the LPL and HL enzymes hydrolyze triglycerides found in circulating lipoproteins.

Due to the insolubility of lipids in water, the plasma transports complex lipids among various tissues as components of lipoproteins. Each lipoprotein contains a neutral lipid core composed of triacylglycerol and/or a cholesterol ester. Surrounding the core is a layer of proteins, phospholipids, and cholesterol. The proteins associated with the lipoprotein comprise a class of proteins referred to as apoproteins (apo). Based on apoprotein composition and density, lipoproteins have been classified into five major types that include chylomicrons, high-density lipoproteins (HDL), intermediate-density lipoproteins (IDL), low-density lipoproteins (LDL), and very-low-density lipoproteins (VLDL).

Lipoprotein lipase (LPL) is the major enzyme responsible for the hydrolysis of triglyceride molecules present in circulating lipoproteins. LPL is associated with the luminal side of capillaries and arteries through an interaction with heparan sulfate chains of proteoglycans and/or by glycosyl phosphatidylinositol. With the help of the activator apo CII, LPL hydrolyzes triglycerides of lipoproteins to produce free fatty acids. Muscle and adipose tissue assimilate these fatty acids. Alternatively, the fatty acids can be bound to albumin and transported to other tissues. As the lipase hydrolyzes the triglycerides of the lipoprotein, the particles become smaller and are often referred to as lipoprotein remnants. Within the plasma compartment, LPL converts chylomicrons to remnants and begins the cascade required for conversion of VLDL to LDL.

In its active form, LPL is a glycosylated non-covalent homodimer, with each subunit containing a binding site for heparin and apolipoprotein (apo) CII, an activator protein required for LPL activity. In addition to hydrolysis of triglycerides, LPL can hydrolyze a variety of other substrates, for example, long and short chain glycerides, phospholipids and various synthetic substrates (Olivecrona et al. (1987) Lipoprotein Lipase, Borensztajn, J. (ed) Evener Publisher, Inc., pages 15-58).

In addition to the lipolytic activity of LPL described above, LPL plays additional roles in lipid metabolism. After sufficient hydrolysis, lipoprotein lipase is released from proteoglycans and travels with the remnants of the chylomicrons or VLDL. In the plasma, LPL may then act to sequester the remnant particles on surface proteoglycans. Subsequently, LPL can act as a ligand for receptors such as the LDL receptor, LDL-receptor related protein, gp330, or the VLDL receptor. This interaction with the cell surface receptor facilitates the uptake and degradation of plasma lipoproteins by cells (Williams et al. (1992) J. Biol. Chem. 267:13284-13292 and Nykjaer et al. (1993) J. Biol. Chem. 268:15048-15055).

Furthermore, LPL expressed in macrophages has been implicated in the cellular uptake of lipoprotein lipids and fat soluble vitamins, the degradation of lipid-containing pathogens and cell debris, and the creation of fatty acids for the energy requirements of the cell.

Disruption of LPL activity has also been implicated in other biological functions including, for example, enhanced oxidative stress in blood cells, increased fluidity of the membrane components of these cells, and increased susceptibility of their mitochondrial DNA to structural alterations (Ven Murthy et al. (1996) Acta Biochimica Polonica 43:227-40).

Hepatic lipase (HL) has functions in lipid metabolism similar to those of LPL. HL is located on the surface of liver sinusoids through glycosaminoglycan links where it interacts with lipoproteins and hydrolyzes triglycerides into free fatty acids. Unlike LPL, the activity of HL does not require an activator, but its activity may be stimulated by apo E. Thus, the preferred substrates of HL are the triglycerides of apo E-containing lipoproteins, such as chylomicron remnants, IDL, and HDL. Furthermore, the actions of HL on HDL are important in the reverse cholesterol transport process, a mechanism thought to reduce excess accumulation of cholesterol in hepatic tissue.

Like LPL, hepatic lipase has also been implicated in the uptake and degradation of lipoprotein in the hepatic tissue. Evidence suggests that HL may interact with cell surface receptors, such as those described above, and direct hepatic cellular uptake of lipoproteins and lipoprotein remnants (Chappell et al. (1998) Progress in Lipid Research 37: 363-422).

In its active form, HL exists as a monomer comprising both triglyceride lipase activity and phospholipase activity. As with LPL, treatment with heparin results in the release of HL from the cell surfaces. While glycosylation plays an important role in secretion and affinity of LPL, it does not seem to be crucial for HL activity.

Pancreatic lipase (PL) is synthesized in acinar cells of the exocrine pancreas along with its protein activator, colipase. The pancreatic duct transports glycosylated PL and colipase into the duodenum. PL does not become anchored to membrane surfaces like LPL or HL. Instead, the free monomer of PL interacts with colipase, which helps to anchor the PL to the lipid-water interface where the enzyme completes the hydrolysis of alimentary triglycerides.

In summary, lipases play a key role in lipid metabolism by regulating levels of cholesterol and triglycerides and therefore influence major metabolic processes including effects on lipid and lipoprotein concentrations, energy homeostasis, body weight, and body composition parameters. Each of these metabolic consequences has been associated with common diseases, such as hypertriglyceridemia, atherosclerosis, obesity and various other disease states described further below.

Accordingly, lipases are a major target for drug action and development. Thus, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown lipases. The present invention advances the state of the art by providing a previously unidentified human lipase enzyme.

SUMMARY OF THE INVENTION

It is an object of the invention to identify novel lipases.

It is a further object of the invention to provide novel lipase polypeptides that are useful as reagents or targets in assays applicable to treatment and diagnosis of lipase-mediated or -related disorders, especially disorders mediated by or related to lipase enzymes.

It is a further object of the invention to provide polynucleotides corresponding to the novel lipase polypeptides that are useful as targets and reagents in assays applicable to treatment and diagnosis of lipase or lipase-mediated or -related disorders and useful for producing novel lipase polypeptides by recombinant methods.

A specific object of the invention is to identify compounds that act as agonists and antagonists and modulate the expression of the novel lipase.

A further specific object of the invention is to provide compounds that modulate expression of the lipase for treatment and diagnosis of lipase and lipase-related disorders.

The invention is thus based on the identification of a novel human lipase. The amino acid sequence is shown in SEQ ID NO:17. The nucleotide sequence is shown in SEQ ID NO:18.

The invention provides isolated lipase polypeptides, including a polypeptide having the amino acid sequence shown in SEQ ID NO:17 or the amino acid sequence encoded by the cDNA deposited as ATCC Patent Deposit No. PTA-1915 on May 24, 2000 (“the deposited cDNA”).

The invention also provides isolated lipase nucleic acid molecules having the sequence shown in SEQ ID NO:18 or in the deposited cDNA.

The invention also provides variant polypeptides having an amino acid sequence that is substantially homologous to the amino acid sequence shown in SEQ ID NO:17 or encoded by the deposited cDNA.

The invention also provides variant nucleic acid sequences that are substantially homologous to the nucleotide sequence shown in SEQ ID NO:18 or in the deposited cDNA.

The invention also provides fragments of the polypeptide shown in SEQ ID NO:17 and nucleotide sequence shown in SEQ ID NO:18, as well as substantially homologous fragments of the polypeptide or nucleic acid.

The invention further provides nucleic acid constructs comprising the nucleic acid molecules described herein. In a preferred embodiment, the nucleic acid molecules of the invention are operatively linked to a regulatory sequence.

The invention also provides vectors and host cells for expressing the lipase nucleic acid molecules and polypeptides, and particularly recombinant vectors and host cells.

The invention also provides methods of making the vectors and host cells and methods for using them to produce the lipase nucleic acid molecules and polypeptides.

The invention also provides antibodies or antigen-binding fragments thereof that selectively bind the lipase polypeptides and fragments.

The invention also provides methods of screening for compounds that modulate expression or activity of the lipase polypeptides or nucleic acid (RNA or DNA).

The invention also provides a process for modulating lipase polypeptide or nucleic acid expression or activity, especially using the screened compounds. Modulation may be used to treat conditions related to aberrant activity or expression of the lipase polypeptides or nucleic acids or aberrant activity resulting in the altered accumulation/degradation of lipids.

The invention also provides assays for determining the activity of or the presence or absence of the lipase polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

The invention also provides assays for determining the presence of a mutation in the polypeptides or nucleic acid molecules, including for disease diagnosis.

In still a further embodiment, the invention provides a computer readable means containing the nucleotide and/or amino acid sequences of the nucleic acids and polypeptides of the invention, respectively.

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the invention are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.

Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Polypeptides

The invention is based on the identification of a novel human lipase. Specifically, an expressed sequence tag (EST) was selected based on homology to lipase sequences. Primers were designed from sequences contained in this EST and used to identify a cDNA from a brain library. Positive clones were sequenced and the overlapping fragments were assembled. Analysis of the assembled sequence revealed that the cloned cDNA molecule encodes a lipase.

The invention thus relates to a novel lipase having the deduced amino acid sequence shown in FIGS. 39A-39B (SEQ ID NO:17) or having the amino acid sequence encoded by the cDNA insert of the plasmid deposited with American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, Va. 20110-2209, on May 24, 2000 and assigned Patent Deposit Number PTA-1915.

The deposits will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms. The deposits are provided as a convenience to those of skill in the art and are not an admission that a deposit is required under 35 U.S.C. §112. The deposited sequences, as well as the polypeptides encoded by the sequences, are incorporated herein by reference and control in the event of any conflict, such as a sequencing error, with the description in this application.

“Lipase polypeptide” or “lipase protein” refers to the polypeptide in SEQ ID NO:17 or encoded by the deposited cDNA. The term “lipase protein” or “lipase polypeptide”, however, further includes the numerous variants described herein, as well as fragments derived from the full-length lipase and variants.

Tissues and/or cells in which the lipase is expressed include, but are not limited to, those shown in FIGS. 43, 44, and 45. Tissues in which the gene is highly expressed include liver, fetal liver, breast, brain, fetal kidney, and testis. Moderate expression occurs in prostate, skeletal muscle, colon, kidney, and thyroid. Lower positive expression occurs in heart, fetal heart, small intestine, spleen, lung, ovary, vein, aorta, placenta, osteoblasts, cervix, esophagus, thymus, tonsil, and lymph node. The lipase is also expressed in malignant breast, lung, and colon tissue and in liver metastases derived from malignant colonic tissues. Hence, the lipase is relevant to disorders involving the tissues in which it is expressed.

The present invention thus provides an isolated or purified lipase polypeptide and variants and fragments thereof.

Based on Clustal W sequence alignment, highest homology was shown to lipase 1 precursor (triacylglycerol lipase) from Psychrobacter immobilis (Acc. No. Q02104).

As used herein, a polypeptide is said to be “isolated” or “purified” when it is substantially free of cellular material when it is isolated from recombinant and non-recombinant cells, or free of chemical precursors or other chemicals when it is chemically synthesized. A polypeptide, however, can be joined to another polypeptide with which it is not normally associated in a cell and still be considered “isolated” or “purified.”

The lipase polypeptides can be purified to homogeneity. It is understood, however, that preparations in which the polypeptide is not purified to homogeneity are useful and considered to contain an isolated form of the polypeptide. The critical feature is that the preparation allows for the desired function of the polypeptide, even in the presence of considerable amounts of other components. Thus, the invention encompasses various degrees of purity.

In one embodiment, the language “substantially free of cellular material” includes preparations of the lipase having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins. When the polypeptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20%, less than about 10%, or less than about 5% of the volume of the protein preparation.

A lipase polypeptide is also considered to be isolated when it is part of a membrane preparation or is purified and then reconstituted with membrane vesicles or liposomes.

The language “substantially free of chemical precursors or other chemicals” includes preparations of the lipase polypeptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language “substantially free of chemical precursors or other chemicals” includes preparations of the polypeptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

In one embodiment, the lipase polypeptide comprises the amino acid sequence shown in SEQ ID NO:17 or the mature form of the polypeptide. However, the invention also encompasses sequence variants. Variants include a substantially homologous protein encoded by the same genetic locus in an organism, i.e., an allelic variant.

Variants also encompass proteins derived from other genetic loci in an organism, but having substantial homology to the lipase of SEQ ID NO:17. Variants also include proteins substantially homologous to the lipase but derived from another organism, i.e., an ortholog. Variants also include proteins that are substantially homologous to the lipase that are produced by chemical synthesis. Variants also include proteins that are substantially homologous to the lipase that are produced by recombinant methods. It is understood, however, that variants exclude any amino acid sequences disclosed prior to the invention.

As used herein, two proteins (or a region of the proteins) are substantially homologous when the amino acid sequences are at least about 70-75%, typically at least about 80-85%, and most typically at least about 90-95% or more homologous. A substantially homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid sequence hybridizing to the nucleic acid sequence, or portion thereof, of the sequence shown in SEQ ID NO:18 under stringent conditions as more fully described below.

To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, or 90% of the length of the reference sequence (i.e., 100%=the entire coding sequence). The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
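By way of illustration only, the position-by-position comparison described above can be sketched in Python. The function name, the use of "-" as the gap symbol, and the choice of the alignment length (gap columns included) as the denominator are assumptions made for this sketch; other conventions for the denominator exist, and the specification does not prescribe one.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two sequences that have ALREADY been aligned.

    Assumption for this sketch: the denominator is the number of alignment
    columns, so every gap position lowers the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    # Ignore columns in which both sequences carry a gap symbol.
    columns = [
        (a, b)
        for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        return 0.0

    # A column counts as identical only when both residues are the same
    # and neither one is a gap.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)
```

For example, the hypothetical aligned pair "ACDE-G" and "ACDQFG" shares four identical positions over six columns.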

The invention also encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by the lipase. Similarity is determined by conservative amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Conservative substitutions are likely to be phenotypically silent. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1
Conservative Amino Acid Substitutions

Aromatic       Phenylalanine, Tryptophan, Tyrosine
Hydrophobic    Leucine, Isoleucine, Valine
Polar          Glutamine, Asparagine
Basic          Arginine, Lysine, Histidine
Acidic         Aspartic Acid, Glutamic Acid
Small          Alanine, Serine, Threonine, Methionine, Glycine
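As a hedged illustration, the groupings of Table 1 can be encoded as a simple lookup that reports whether a given replacement stays within one group. The one-letter residue codes, the dictionary name, and the function name are assumptions of this sketch and do not appear in the specification.

```python
# Groups transcribed from Table 1, written with one-letter residue codes.
TABLE_1_GROUPS = {
    "aromatic": "FWY",     # Phe, Trp, Tyr
    "hydrophobic": "LIV",  # Leu, Ile, Val
    "polar": "QN",         # Gln, Asn
    "basic": "RKH",        # Arg, Lys, His
    "acidic": "DE",        # Asp, Glu
    "small": "ASTMG",      # Ala, Ser, Thr, Met, Gly
}


def is_conservative(res_a: str, res_b: str) -> bool:
    """Return True when two different residues fall in the same Table 1 group."""
    a, b = res_a.upper(), res_b.upper()
    if a == b:
        # An unchanged residue is not a substitution at all.
        return False
    return any(a in group and b in group for group in TABLE_1_GROUPS.values())
```

Under this sketch a Leu to Ile change is reported as conservative, whereas an Asp to Lys change is not.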

The comparison of sequences and determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991).

A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al. (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See www.ncbi.nlm.nih.gov. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).

In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman et al. (1970) (J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (Devereux et al. (1984) Nucleic Acids Res. 12(1):387) (available at www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
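For orientation, a global alignment by dynamic programming of the kind introduced by Needleman et al. can be sketched as follows. This is a deliberately simplified illustration and not the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(seq_a: str, seq_b: str,
                     match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment with a linear gap penalty (simplified sketch).

    Returns (score, aligned_a, aligned_b), with "-" marking gaps.
    """
    n, m = len(seq_a), len(seq_b)

    # score[i][j] = best score aligning seq_a[:i] with seq_b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in seq_b
                score[i][j - 1] + gap,      # gap in seq_a
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                out_a.append(seq_a[i - 1])
                out_b.append(seq_b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The aligned strings returned here are the kind of input over which identical positions and gaps are counted when percent identity is computed.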

Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torelli et al. (1994) Comput. Appl. Biosci. 10:3-5; and FASTA described in Pearson et al. (1988) PNAS 85:2444-8.

A variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these.

Variant polypeptides can be fully functional or can lack function in one or more activities. Thus, in the present case, variations can affect the function of the lipase at a variety of biological levels, including disrupting interactions with the proteoglycans, such as CSPG, HSPG, DSPG, disrupting interaction with cell surface receptors, such as the LDL receptor, LDL-related receptor protein, gp330, or the VLDL receptor, disrupting interactions with heparin, disrupting interactions with apoproteins or lipoproteins, disrupting interactions with activator molecules, such as apo CII or colipase, disrupting triglyceride lipase activity or phospholipase activity, or disrupting homodimer formation. Variant polypeptides having such defects have been identified for LPL and are described in, for example, Murthy et al. (1996) Pharmacol. Ther. 70: 101-135, incorporated herein by reference for teaching these variations.

Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. Functional variants can also contain substitution of similar amino acids, which results in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree.

Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or deletion in a critical residue or critical region.

As indicated, variants can be naturally-occurring or can be made by recombinant means or chemical synthesis to provide useful and novel characteristics for the lipase polypeptide. This includes preventing immunogenicity from pharmaceutical formulations by preventing protein aggregation.

Useful variations further include alteration of catalytic activity. For example, one embodiment involves a variation at the binding site that results in binding but not hydrolysis, or slower hydrolysis, of the triglyceride or phospholipid. A further useful variation results in an increased rate of hydrolysis of the triglycerides or phospholipids. Additional variations include altered affinity for co-activator proteins, cell surface receptors, proteoglycans, heparin, triglycerides, phospholipids, lipoproteins or apoproteins. A further useful variation at the same site can result in higher or lower affinity for substrates. Useful variations also include changes that result in affinity to a different lipoprotein or lipoprotein remnant than that normally recognized. Other variations could result in altered recognition of apoproteins thereby changing the preferred lipoproteins hydrolyzed by the lipase. Further useful variations affect the ability of the lipase to be induced by various activators, including, but not limited to, those disclosed herein. Specific variations include truncations in which a catalytic domain or substrate binding domain is deleted. This variation results in a decrease or loss of lipid hydrolytic activity or substrate binding. Another useful variation includes one that prevents glycosylation. Further useful variations provide a fusion protein in which one or more domains or subregions are operationally fused to one or more domains or subregions from another lipase. Specifically, a domain or subregion can be introduced that provides a rescue function to an enzyme not normally having this function or for recognition of a specific substrate wherein recognition is not available to the original enzyme. Further variations could affect specific subunit interaction, particularly required for homodimerization or interaction with activator proteins. Other variations would affect developmental, temporal, or tissue-specific expression. Other variations would affect the interaction with cellular components, such as transcriptional regulatory factors.

Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al. (1989) Science 244:1081-1085). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity, such as the ability to hydrolyze triglycerides or phospholipids in vitro. Alternatively, in vitro activity may be measured by the ability to interact with various molecules, including but not limited to, heparin, proteoglycans, cell surface receptors, lipoproteins, apoproteins or activator proteins. Sites that are critical for binding or recognition can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al. (1992) J. Mol. Biol. 224:899-904; de Vos et al. (1992) Science 255:306-312).
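The design step of alanine-scanning mutagenesis, in which each position is replaced in turn by alanine, can be illustrated with a short sketch that enumerates the single-site mutant sequences to be made and tested. The function name and the mutation labels (for example "K2A") are assumptions of this sketch; positions that are already alanine are skipped here because replacing them would leave the sequence unchanged.

```python
def alanine_scan(sequence: str):
    """List single-alanine mutants of a protein sequence (one-letter codes).

    Returns (label, mutant_sequence) pairs, where the label follows the
    common "original residue, 1-based position, A" style, e.g. "K2A".
    """
    mutants = []
    for pos, residue in enumerate(sequence, start=1):
        if residue == "A":
            # Already alanine: no distinct mutant exists at this position.
            continue
        mutant = sequence[:pos - 1] + "A" + sequence[pos:]
        mutants.append((f"{residue}{pos}A", mutant))
    return mutants
```

Each mutant in the resulting list would then be expressed and assayed, for instance for triglyceride hydrolysis, as described above.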

The assays for lipase enzyme activity are well known in the art and can be found, for example, in Brun et al. (1989) Metabolism 38:1005-1009, Brunzell et al. (1992) Atherosclerosis IX, Stein (eds.) R&L Creative Communications Ltd., Tel Aviv 271-273, Peeva et al. (1992) Int. J. Obes. Relat. Metab. Disord. 16: 737-744, Ma et al. (1991) N. Engl. J. Med. 324: 1761-1766, Ma et al. (1992) J. Biol. Chem. 267: 1918-1923, Connelly et al. (1987) J. Clin. Invest. 80: 1597-1606, Huff et al. (1990) J. Lipid Res. 31: 385-396, and Hixson et al. (1990) J. Lipid Res. 31: 545-548. These assays include measurements of triglyceride or lipoprotein concentrations in the blood stream. For lipases associated with proteoglycans, plasma lipolytic activity may be determined following heparin treatment. In this protocol, lipase activity is measured with a synthetic triglyceride substrate using plasma samples obtained following heparin administration. Post-heparin plasma may also be used to measure the lipase mass by immunoassay to determine if a catalytically defective lipase enzyme is released into the plasma. Lipase activity can also be determined in s.c. biopsies of adipose tissue and through the detection of lipase gene mutations. Additional assays include measuring lipase activation by the co-activator molecules.

Substantial homology can be to the entire nucleic acid or amino acid sequence or to fragments of these sequences.

The invention thus also includes polypeptide fragments of the lipase. Fragments can be derived from the amino acid sequence shown in SEQ ID NO:17. However, the invention also encompasses fragments of the variants of the lipase as described herein.

The fragments to which the invention pertains, however, are not to be construed as encompassing fragments that may be disclosed prior to the present invention.

Accordingly, a fragment can comprise at least about 8, 13, 15, 20, 25, 30, 35, 40, 45, 50 or more contiguous amino acids. Fragments can retain one or more of the biological activities of the protein, for example the ability to bind to polyglycan, interact with cell surface receptors, interact with activator molecules, catalyze triglyceride hydrolysis, or retain phospholipase activity. Fragments can be used as an immunogen to generate lipase antibodies.
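Purely as an illustration of what "contiguous amino acids" means computationally, the following sketch enumerates every contiguous fragment of a chosen length from a sequence. The function name is hypothetical, and the example sequence used below is a placeholder, not SEQ ID NO:17.

```python
def contiguous_fragments(sequence: str, length: int = 8):
    """Return every contiguous fragment of exactly `length` residues.

    A sequence shorter than `length` yields an empty list.
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    # Slide a window of the requested length along the sequence.
    return [
        sequence[start:start + length]
        for start in range(len(sequence) - length + 1)
    ]
```

For a placeholder sequence of ten residues, a window of eight yields three overlapping fragments.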

Biologically active fragments (peptides which are, for example, 5, 7,10, 12, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acidsin length) can comprise a domain or motif, e.g., catalytic sites, signalpeptides, transmembrane segments, and sites for protein kinase Cphosphorylation, casein kinase II phosphorylation, and N-myristoylation.Additional domains include catalytic domains involved in triglyceridehydrolysis and phospholipase activity, heparin binding sites,cell-surface receptor binding sites, triglyceride binding sites, sitesimportant for homodimerization or activator interaction, and sitesimportant for carrying out the other functions of the lipase asdescribed herein.

Such domains or motifs can be identified by means of routinecomputerized homology searching procedures.

Fragments, for example, can extend in one or both directions from thefunctional site to encompass 5, 10, 15, 20, 30, 40, 50, or up to 100amino acids. Further, fragments can include sub-fragments of thespecific domains mentioned above, which sub-fragments retain thefunction of the domain from which they are derived.

These regions can be identified by well-known methods involvingcomputerized homology analysis.

The invention also provides fragments with immunogenic properties. These contain an epitope-bearing portion of the lipase and variants. These epitope-bearing peptides are useful to raise antibodies that bind specifically to a lipase polypeptide or region or fragment. These peptides can contain at least 8, at least 10, 13, 15, or between at least about 16 to about 30 amino acids.

Non-limiting examples of antigenic polypeptides that can be used to generate antibodies include peptides derived from an extracellular site. Regions having a high antigenicity index are shown in FIG. 40. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular peptide regions.

The epitope-bearing lipase polypeptides may be produced by any conventional means (Houghten, R. A. (1985) Proc. Natl. Acad. Sci. USA 82:5131-5135). Simultaneous multiple peptide synthesis is described in U.S. Pat. No. 4,631,211.

Fragments can be discrete (not fused to other amino acids or polypeptides) or can be within a larger polypeptide. Further, several fragments can be comprised within a single larger polypeptide. In one embodiment a fragment designed for expression in a host can have heterologous pre- and pro-polypeptide regions fused to the amino terminus of the lipase fragment and an additional region fused to the carboxyl terminus of the fragment.

The invention thus provides chimeric or fusion proteins. These comprise a lipase peptide sequence operatively linked to a heterologous peptide having an amino acid sequence not substantially homologous to the lipase. “Operatively linked” indicates that the lipase peptide and the heterologous peptide are fused in-frame. The heterologous peptide can be fused to the N-terminus or C-terminus of the lipase or can be internally located.

In one embodiment the fusion protein does not affect lipase function per se. For example, the fusion protein can be a GST-fusion protein in which the lipase sequences are fused to the N- or C-terminus of the GST sequences. Other types of fusion proteins include, but are not limited to, enzymatic fusion proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL-4 fusions, poly-His fusions and Ig fusions. Such fusion proteins, particularly poly-His fusions, can facilitate the purification of a recombinant lipase protein. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a heterologous signal sequence. Therefore, in another embodiment, the fusion protein contains a heterologous signal sequence at its N-terminus.

EP-A-0 464 533 discloses fusion proteins comprising various portions of immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EP-A-0 232 262). In drug discovery, for example, human proteins have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists (Bennett et al. (1995) J. Mol. Recog. 8:52-58 and Johanson et al. J. Biol. Chem. 270:9459-9471). Thus, this invention also encompasses soluble fusion proteins containing a lipase polypeptide and various portions of the constant regions of heavy or light chains of immunoglobulins of various subclasses (IgG, IgM, IgA, IgE). Preferred as immunoglobulin is the constant part of the heavy chain of human IgG, particularly IgG1, where fusion takes place at the hinge region. For some uses it is desirable to remove the Fc after the fusion protein has been used for its intended purpose, for example when the fusion protein is to be used as antigen for immunizations. In a particular embodiment, the Fc part can be removed in a simple way by a cleavage sequence, which is also incorporated and can be cleaved with factor Xa.

A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al. (1992) Current Protocols in Molecular Biology). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). A lipase-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the lipase.
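The in-frame requirement for such a fusion can be illustrated with a minimal sketch. The function `fuse_in_frame` and the short toy coding sequences below are hypothetical and are used only to show the check; the sketch assumes the fusion moiety is upstream, that both inputs are plain coding sequences read from their first base, and that the standard stop codons TAA, TAG and TGA apply.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a DNA string into complete codons, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so that the downstream one stays in frame.

    The upstream sequence must be a whole number of codons and must not
    contain an in-frame stop codon, otherwise the fusion would be
    frameshifted or truncated before the downstream sequence is reached.
    """
    upstream = upstream_cds.upper()
    downstream = downstream_cds.upper()
    if len(upstream) % 3 != 0:
        raise ValueError("upstream length is not a multiple of three")
    if any(codon in STOP_CODONS for codon in codons(upstream)):
        raise ValueError("upstream sequence contains an in-frame stop codon")
    return upstream + downstream


# Toy sequences only: a three-codon upstream tag and a three-codon downstream insert.
fused = fuse_in_frame("ATGGCTCAC", "GGTTCTTAA")
print(codons(fused))
```

A real construct would also need to account for linker, cleavage-site and vector sequences between the two moieties, which this sketch does not model.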

Another form of fusion protein is one that directly affects lipase functions. Accordingly, a lipase polypeptide is encompassed by the present invention in which one or more of the lipase domains (or parts thereof) has been replaced by homologous lipase domains (or parts thereof) from another species. Accordingly, various permutations are possible. One or more functional sites as disclosed herein from the specifically disclosed lipase can be replaced by one or more functional sites from a corresponding lipase of another species. Thus, chimeric lipases can be formed in which one or more of the native domains or subregions has been replaced by another. For example, the catalytic domain of the lipase of the present invention may be replaced by the catalytic domain of a different lipase polypeptide. Alternatively, protein domains that mediate the interaction with lipoproteins or domains that mediate the uptake of lipoproteins by cell surface receptors can be used to replace homologous domains of the lipase of the present invention. In doing so the binding affinity to various substrates and/or the rate of catalysis is altered.

Additionally, chimeric lipase proteins can be produced in which one or more functional sites is derived from a different member of the lipase superfamily. It is understood however that sites could be derived from lipase families that occur in the mammalian genome but which have not yet been discovered or characterized. Such sites include but are not limited to any of the functional sites disclosed herein.

The isolated lipase can be purified from any of the cells that naturally express it, including, but not limited to, those shown in FIGS. 43, 44, and 45. Tissues in which the gene is highly expressed include liver, fetal liver, breast, brain, fetal kidney, and testis. Moderate expression occurs in prostate, skeletal muscle, colon, kidney, and thyroid. Lower positive expression occurs in heart, fetal heart, small intestine, spleen, lung, ovary, vein, aorta, placenta, osteoblasts, cervix, esophagus, thymus, tonsil, and lymph node. The lipase is also expressed in normal liver and in normal and malignant breast, lung, and colon tissue and in liver metastases derived from malignant colonic tissues. Alternatively, the lipase may be purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods.

In one embodiment, the protein is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the lipase polypeptide is cloned into an expression vector, the expression vector is introduced into a host cell, and the protein is expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques.

Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally-occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in polypeptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art.

Accordingly, the polypeptides also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence for purification of the mature polypeptide or a pro-protein sequence.

Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

Such modifications are well known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as Proteins: Structure and Molecular Properties, 2nd ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (1990) Meth. Enzymol. 182:626-646; and Rattan et al. (1992) Ann. N.Y. Acad. Sci. 663:48-62.

As is also well known, polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of post-translation events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translational natural processes and by synthetic methods.

Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. Blockage of the amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in naturally-occurring and synthetic polypeptides. For instance, the amino-terminal residue of polypeptides made in E. coli, prior to proteolytic processing, almost invariably will be N-formylmethionine.

The modifications can be a function of how the protein is made. For recombinant polypeptides, for example, the modifications will be determined by the host cell posttranslational modification capacity and the modification signals in the polypeptide amino acid sequence. Accordingly, when glycosylation is desired, a polypeptide should be expressed in a glycosylating host, generally a eukaryotic cell. Insect cells often carry out the same posttranslational glycosylations as mammalian cells and, for this reason, insect cell expression systems have been developed to efficiently express mammalian proteins having native patterns of glycosylation. Similar considerations apply to other modifications.

The same type of modification may be present in the same or varying degree at several sites in a given polypeptide. Also, a given polypeptide may contain more than one type of modification.

Polypeptide Uses

The protein sequences of the present invention can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.
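The role of the wordlength parameter can be pictured with a deliberately simplified sketch. The code below is not the BLAST algorithm and does not reproduce NBLAST or XBLAST scoring; it only lists the exact words of a given length that a query and a subject sequence have in common, which is the kind of short match from which word-based searches start. The function name `shared_words` and both toy peptide strings are assumptions made for illustration.

```python
def shared_words(query, subject, wordlength=3):
    """Return the sorted words of `wordlength` residues found in both sequences.

    Only exact matches are considered; real BLAST protein searches also
    score similar (non-identical) words and then extend the hits.
    """
    def words(seq):
        return {seq[i:i + wordlength] for i in range(len(seq) - wordlength + 1)}

    return sorted(words(query) & words(subject))


# Toy peptides, not taken from any sequence listing.
print(shared_words("GHSLGAH", "AAGHSMGAHK", wordlength=3))
```

A longer wordlength (for example, 12 for nucleotide searches) yields fewer, more specific seed matches, whereas a shorter one (for example, 3 for protein searches) is more sensitive.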

The lipase polypeptides are useful for producing antibodies specific for the lipase protein, regions, or fragments. Regions having a high antigenicity index score are shown in FIG. 40.

The lipase polypeptides are useful for biological assays related to lipase function. Such assays involve any of the known functions or activities or properties useful for diagnosis and treatment of lipase- or lipase-related conditions or conditions in which expression of the lipase is relevant, such as in hypertriacylglycerolaemia, obesity, atherogenesis, and the various other conditions described herein. Potential assays have been disclosed herein.

The lipase polypeptides are also useful in drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the lipase, as a biopsy or expanded in cell culture. In one embodiment, however, cell-based assays involve recombinant host cells expressing the lipase.

Determining the ability of the test compound to interact with the lipase can also comprise determining the ability of the test compound to preferentially bind to the polypeptide as compared to the ability of a known binding molecule (e.g., an activator (such as colipase, apo CII), cell surface receptor, heparin, proteoglycan, triglyceride, phospholipid, or lipoprotein) to bind to the polypeptide.

The polypeptides can be used to identify compounds that modulate lipase activity. Modulators of lipase activity comprise agents that influence the enzyme at a variety of biological levels, including, but not limited to, agents that disrupt the interaction with the proteoglycans of the cell wall, such as HSPG-degrading enzymes, heparin, chlorate, or APOE; agents that disrupt the interaction with cell surface receptors; agents which disrupt the interaction with activator molecules or homodimer formation; agents that disrupt interaction with lipoproteins; or agents that disrupt triglyceride hydrolysis or phospholipase activity.

The tissue-specific regulation of lipase is complex, with identical modulators regulating activity differently under various metabolic conditions. While specific modulators of lipase activity have been described above, additional modulators include, but are not limited to, apoproteins and a non-proteoglycan LPL-binding protein having sequence homology to apo B and apo B (Sivaram et al. (1992) J. Biol. Chem. 267:16517-16552; Sivaram et al. (1994) J. Biol. Chem. 269:9409-9412). It has also been postulated that the lipolysis-stimulated receptor (LSR) plays a role in LPL activation (Yen et al. (1994) Biochemistry 33:1172-1180). Additional modulators of lipase activity include fasting, feeding, growth hormone, insulin, exercise, estrogen, thyroid hormone, catecholamines, hormones of the adrenergic system, vitamin D derivatives, glucagon, glucocorticoids, and 1,25-dihydroxyvitamin D. Further modulators comprise inflammatory mediators such as cytokines, interleukins, and interferons.

Modulators associated with an increase in lipase activity include, but are not limited to, various apoproteins, such as apo CII, and glycosylation. Furthermore, lipase enzymatic activity is stabilized in the presence of lipids or by binding to lipid-water interfaces and detergents, such as deoxycholate. Modulators associated with a decrease in lipase activity include, but are not limited to, increased concentrations of apo CII or apo CIII (Shirari et al. (1981) Biochim. Biophys. Acta 665:504-510), TNF (Kern et al. (1997) Journal of Nutrition 127:1917S-1922S), fatty acids, high salt concentrations, and Orlistat (La Roche, Basel).

Both transcriptional and post-transcriptional levels of lipase expression are regulated by various dietary, environmental, and developmental factors, including, for example, hormones such as insulin, thyroid hormone, and glucocorticoids (Pykalisto et al. (1976) J. Clin. Endocrinol. Metab. 43:591-600; Nilsson-Ehle et al. (1980) Annual Rev. Biochem. 49:667-693; and Cryer et al. (1981) Int. J. Biochem. 13:525-541). Various transcriptional factors such as CEBP, ADD-1, SREBP-1 and PPAR δ also regulate expression of specific lipases. It is understood, therefore, that such compounds can be identified not only by means of direct interaction with the lipase, but by means of any of the components that functionally interact with the disclosed lipase. This includes, but is not limited to, any of those components disclosed herein.

Both lipase and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to the lipase. These compounds can be further screened against a functional lipase to determine the effect of the compound on the lipase activity. Compounds can be identified that activate (agonist) or inactivate (antagonist) the lipase to a desired degree. Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject).

The lipase polypeptides can be used to screen a compound for the ability to stimulate or inhibit interaction between the lipase protein and a target molecule that normally interacts with the lipase protein. The target can be a lipoprotein, lipoprotein remnant, apoprotein, cell surface receptor, heparin, proteoglycan, triglyceride, phospholipid or another component of the pathway with which the lipase protein normally interacts. The assay includes the steps of combining the lipase protein with a candidate compound under conditions that allow the lipase protein or fragment to interact with the target molecule, and detecting the formation of a complex between the lipase protein and the target or detecting the biochemical consequence of the interaction between the lipase and the target. Any of the associated effects of triglyceride hydrolysis or phospholipase function can be assayed. This includes the production of fatty acids from triglycerides and phospholipids.

Determining the ability of the lipase to bind to a target molecule can also be accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) (Sjolander et al. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). As used herein, “BIA” is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.

The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra).

Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al. (1991) Nature 354:82-84; Houghten et al. (1991) Nature 354:84-86) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al. (1993) Cell 72:767-778); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble full-length lipase or fragment that competes for substrate binding. Other candidate compounds include mutant lipases or appropriate fragments containing mutations that affect lipase function and compete for substrate. Accordingly, a fragment that competes for substrate, for example with a higher affinity, or a fragment that binds substrate but does not hydrolyze the triglyceride or phospholipid, is encompassed by the invention.

Other candidate compounds include a lipase protein or protein analog that binds to the lipid, lipoprotein, proteoglycan, cell surface receptor, or other substrates identified herein but is not released or is released slowly. Other candidate compounds include analogs of the other natural substrates, such as substrates that bind to the lipase but are not released or are released more slowly. Further candidate compounds include activators of the lipases, including but not limited to those disclosed herein.

The invention provides other end points to identify compounds that modulate (stimulate or inhibit) lipase activity. The assays typically involve an assay of events in the pathway that indicate lipase activity. This can include cellular events that are influenced by lipid metabolism, such as, but not limited to, lipid or lipoprotein concentrations. Specific phenotypes include metabolic consequences including effects on energy homeostasis, body weight and body composition parameters.

Assays are based on the multiple cellular functions of lipase enzymes. As described herein, these enzymes act at various levels in the regulation of lipid metabolism. Accordingly, assays can be based on detection of any of the products produced by the lipase enzyme.

Further, the expression of genes that are up- or down-regulated by action of the lipase can be assayed. In one embodiment, the regulatory region of such genes can be operably linked to a marker that is easily detectable, such as luciferase.

Accordingly, any of the biological or biochemical functions mediated by the lipase can be used as an endpoint assay. These include all of the biochemical or biochemical/biological events described herein, in the references cited herein, incorporated by reference for these endpoint assay targets, and other functions known to those of ordinary skill in the art.

Binding and/or activating compounds can also be screened by using chimeric lipase proteins in which one or more domains, sites, and the like, as disclosed herein, or parts thereof, can be replaced by their heterologous counterparts derived from other lipase proteins. For example, a recognition or binding region can be used that interacts with different substrate specificity and/or affinity than the native lipase. Accordingly, a different set of pathway components is available as an end-point assay for activation. Further, sites that are responsible for developmental, temporal, or tissue specificity can be replaced by heterologous sites such that the lipase can be detected under conditions of specific developmental, temporal, or tissue-specific expression.

The lipase polypeptides are also useful in competition binding assays in methods designed to discover compounds that interact with the lipase. Thus, a compound is exposed to a lipase polypeptide under conditions that allow the compound to bind to or to otherwise interact with the polypeptide. A lipase target, comprising a polypeptide or agent which is known to interact with lipase, is also added to the mixture. If the test compound interacts with the soluble lipase polypeptide, it decreases the amount of complex formed or the activity from the lipase target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the lipase. Thus, the soluble polypeptide that competes with the target lipase region is designed to contain peptide sequences corresponding to the region of interest.

Another type of competition-binding assay can be used to discover compounds that interact with specific functional sites. As an example, a candidate compound can be added to a sample of the lipase. Compounds that interact with the lipase at the same site as a lipase substrate disclosed herein will reduce the amount of complex formed between the lipase and substrate. Accordingly, it is possible to discover a compound that specifically prevents interaction between the lipase and its various substrates. A compound that competes with lipase catalytic activity will reduce the rate of triglyceride or phospholipid hydrolysis. Alternatively, a compound may also compete at the level of substrate interaction. Accordingly, compounds can be discovered that directly interact with the lipase and interfere with its function. Such assays can involve any other component that interacts with the lipase such as heparin, proteoglycans, lipoproteins, lipoprotein remnants, cell surface receptors, triglycerides, phospholipids, activator proteins, and other compounds described herein.
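The expected reduction in complex formation in such a competition assay can be estimated, purely for illustration, with the standard single-site competitive binding relation, in which the fraction of lipase occupied by substrate S in the presence of competitor I is [S] / ([S] + Kd(1 + [I]/Ki)). The sketch below assumes equilibrium, a single shared binding site, and free concentrations approximately equal to total concentrations; the function name and all numerical values are hypothetical and are not measured data.

```python
def fraction_bound(substrate, kd, competitor=0.0, ki=1.0):
    """Fraction of lipase in complex with substrate under competitive binding.

    substrate, competitor: free concentrations (same units as kd and ki).
    kd: dissociation constant of the lipase-substrate complex.
    ki: dissociation constant of the lipase-competitor complex.
    """
    if substrate < 0 or competitor < 0 or kd <= 0 or ki <= 0:
        raise ValueError("concentrations must be non-negative and constants positive")
    apparent_kd = kd * (1.0 + competitor / ki)  # competitor raises the apparent Kd
    return substrate / (substrate + apparent_kd)


# Hypothetical values in arbitrary concentration units.
print(fraction_bound(substrate=1.0, kd=1.0))                          # no competitor
print(fraction_bound(substrate=1.0, kd=1.0, competitor=1.0, ki=1.0))  # with competitor
```

Under these assumptions a compound that binds at the substrate site lowers the fraction of complex formed without changing the maximum attainable binding, which is the signature such an assay looks for.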

To perform cell-free drug screening assays, it is desirable to immobilize either the lipase, or fragment, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase/lipase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of lipase-binding protein found in the bead fraction quantitated from the gel using standard electrophoretic techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the protein but which do not interfere with binding of the protein to its target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by antibody conjugation. Preparations of a lipase-binding target component, such as activator proteins, cell surface receptors, lipoproteins, apoproteins, triglycerides or phospholipids, and a candidate compound are incubated in the lipase-presenting wells and the amount of complex trapped in the well can be quantitated. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the lipase target molecule, or which are reactive with lipase and compete with the target molecule; as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the target molecule.

Modulators of lipase activity identified according to these drug screening assays can be used to treat a subject with a disorder mediated or affected by a lipase, by treating cells that express the lipase or cells in which lipase expression is desirable. These methods of treatment include the steps of administering the modulators of lipase activity in a pharmaceutical composition as described herein, to a subject in need of such treatment.

Tissues and/or cells in which the lipase is expressed include, but are not limited to, those shown in FIGS. 43, 44, and 45. Tissues in which the gene is highly expressed include liver, fetal liver, breast, brain, fetal kidney, and testis. Moderate expression occurs in prostate, skeletal muscle, colon, kidney, and thyroid. Lower positive expression occurs in heart, fetal heart, small intestine, spleen, lung, ovary, vein, aorta, placenta, osteoblasts, cervix, esophagus, thymus, tonsil, and lymph node. The lipase is also expressed in malignant breast, lung, and colon tissue and in liver metastases derived from malignant colonic tissues. Hence, the lipase is relevant to disorders involving the tissues in which it is expressed.

Disorders involving the spleen include, but are not limited to, splenomegaly, including nonspecific acute splenitis, congestive splenomegaly, and splenic infarcts; neoplasms, congenital anomalies, and rupture. Disorders associated with splenomegaly include infections, such as nonspecific splenitis, infectious mononucleosis, tuberculosis, typhoid fever, brucellosis, cytomegalovirus, syphilis, malaria, histoplasmosis, toxoplasmosis, kala-azar, trypanosomiasis, schistosomiasis, leishmaniasis, and echinococcosis; congestive states related to portal hypertension, such as cirrhosis of the liver, portal or splenic vein thrombosis, and cardiac failure; lymphohematogenous disorders, such as Hodgkin disease, non-Hodgkin lymphomas/leukemia, multiple myeloma, myeloproliferative disorders, hemolytic anemias, and thrombocytopenic purpura; immunologic-inflammatory conditions, such as rheumatoid arthritis and systemic lupus erythematosus; storage diseases such as Gaucher disease, Niemann-Pick disease, and mucopolysaccharidoses; and other conditions, such as amyloidosis, primary neoplasms and cysts, and secondary neoplasms.

Disorders involving the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

Disorders involving the liver include, but are not limited to, hepatic injury; jaundice and cholestasis, such as bilirubin and bile formation; hepatic failure and cirrhosis, such as cirrhosis, portal hypertension, including ascites, portosystemic shunts, and splenomegaly; infectious disorders, such as viral hepatitis, including hepatitis A-E infection and infection by other hepatitis viruses, clinicopathologic syndromes, such as the carrier state, asymptomatic infection, acute viral hepatitis, chronic viral hepatitis, and fulminant hepatitis; autoimmune hepatitis; drug- and toxin-induced liver disease, such as alcoholic liver disease; inborn errors of metabolism and pediatric liver disease, such as hemochromatosis, Wilson disease, α₁-antitrypsin deficiency, and neonatal hepatitis; intrahepatic biliary tract disease, such as secondary biliary cirrhosis, primary biliary cirrhosis, primary sclerosing cholangitis, and anomalies of the biliary tree; circulatory disorders, such as impaired blood flow into the liver, including hepatic artery compromise and portal vein obstruction and thrombosis, impaired blood flow through the liver, including passive congestion and centrilobular necrosis and peliosis hepatis, hepatic vein outflow obstruction, including hepatic vein thrombosis (Budd-Chiari syndrome) and veno-occlusive disease; hepatic disease associated with pregnancy, such as preeclampsia and eclampsia, acute fatty liver of pregnancy, and intrahepatic cholestasis of pregnancy; hepatic complications of organ or bone marrow transplantation, such as drug toxicity after bone marrow transplantation, graft-versus-host disease and liver rejection, and nonimmunologic damage to liver allografts; tumors and tumorous conditions, such as nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia and focal cerebral ischemia), infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

Disorders involving the heart include, but are not limited to, heart failure, including but not limited to, cardiac hypertrophy, left-sided heart failure, and right-sided heart failure; ischemic heart disease, including but not limited to angina pectoris, myocardial infarction, chronic ischemic heart disease, and sudden cardiac death; hypertensive heart disease, including but not limited to, systemic (left-sided) hypertensive heart disease and pulmonary (right-sided) hypertensive heart disease; valvular heart disease, including but not limited to, valvular degeneration caused by calcification, such as calcific aortic stenosis, calcification of a congenitally bicuspid aortic valve, and mitral annular calcification, and myxomatous degeneration of the mitral valve (mitral valve prolapse), rheumatic fever and rheumatic heart disease, infective endocarditis, and noninfected vegetations, such as nonbacterial thrombotic endocarditis and endocarditis of systemic lupus erythematosus (Libman-Sacks disease), carcinoid heart disease, and complications of artificial valves; myocardial disease, including but not limited to dilated cardiomyopathy, hypertrophic cardiomyopathy, restrictive cardiomyopathy, and myocarditis; pericardial disease, including but not limited to, pericardial effusion and hemopericardium and pericarditis, including acute pericarditis and healed pericarditis, and rheumatoid heart disease; neoplastic heart disease, including but not limited to, primary cardiac tumors, such as myxoma, lipoma, papillary fibroelastoma, rhabdomyoma, and sarcoma, and cardiac effects of noncardiac neoplasms; congenital heart disease, including but not limited to, left-to-right shunts (late cyanosis), such as atrial septal defect, ventricular septal defect, patent ductus arteriosus, and atrioventricular septal defect, right-to-left shunts (early cyanosis), such as tetralogy of Fallot, transposition of great arteries, truncus arteriosus, tricuspid atresia, and total anomalous pulmonary venous connection, obstructive congenital anomalies, such as coarctation of aorta, pulmonary stenosis and atresia, and aortic stenosis and atresia, and disorders involving cardiac transplantation.

Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney, and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease, such as simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic glomerulonephritis, glomerular lesions associated with systemic disease, including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, and tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney), which includes urothelial carcinomas of renal pelvis.

Disorders of the breast include, but are not limited to, disorders of development; inflammations, including but not limited to, acute mastitis, periductal mastitis (recurrent subareolar abscess, squamous metaplasia of lactiferous ducts), mammary duct ectasia, fat necrosis, granulomatous mastitis, and pathologies associated with silicone breast implants; fibrocystic changes; proliferative breast disease including, but not limited to, epithelial hyperplasia, sclerosing adenosis, and small duct papillomas; tumors including, but not limited to, stromal tumors such as fibroadenoma, phyllodes tumor, and sarcomas, and epithelial tumors such as large duct papilloma; carcinoma of the breast including in situ (noninvasive) carcinoma that includes ductal carcinoma in situ (including Paget's disease) and lobular carcinoma in situ, and invasive (infiltrating) carcinoma including, but not limited to, invasive ductal carcinoma, no special type, invasive lobular carcinoma, medullary carcinoma, colloid (mucinous) carcinoma, tubular carcinoma, and invasive papillary carcinoma, and miscellaneous malignant neoplasms.

Disorders in the male breast include, but are not limited to, gynecomastia and carcinoma.

Disorders involving the testis and epididymis include, but are not limited to, congenital anomalies such as cryptorchidism, regressive changes such as atrophy, inflammations such as nonspecific epididymitis and orchitis, granulomatous (autoimmune) orchitis, and specific inflammations including, but not limited to, gonorrhea, mumps, tuberculosis, and syphilis, vascular disturbances including torsion, testicular tumors including germ cell tumors that include, but are not limited to, seminoma, spermatocytic seminoma, embryonal carcinoma, yolk sac tumor, choriocarcinoma, teratoma, and mixed tumors, tumors of sex cord-gonadal stroma including, but not limited to, Leydig (interstitial) cell tumors and Sertoli cell tumors (androblastoma), and testicular lymphoma, and miscellaneous lesions of tunica vaginalis.

Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma.

Disorders involving the thyroid include, but are not limited to, hyperthyroidism; hypothyroidism including, but not limited to, cretinism and myxedema; thyroiditis including, but not limited to, Hashimoto thyroiditis, subacute (granulomatous) thyroiditis, and subacute lymphocytic (painless) thyroiditis; Graves disease; diffuse and multinodular goiter including, but not limited to, diffuse nontoxic (simple) goiter and multinodular goiter; neoplasms of the thyroid including, but not limited to, adenomas, other benign tumors, and carcinomas, which include, but are not limited to, papillary carcinoma, follicular carcinoma, medullary carcinoma, and anaplastic carcinoma; and congenital anomalies.

Disorders involving the skeletal muscle include tumors such as rhabdomyosarcoma.

Disorders involving the small intestine include the malabsorption syndromes, such as celiac sprue, tropical sprue (postinfectious sprue), Whipple disease, disaccharidase (lactase) deficiency, and abetalipoproteinemia, and tumors of the small intestine including adenomas and adenocarcinoma.

Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

Disorders involving the thymus include developmental disorders, such as DiGeorge syndrome with thymic hypoplasia or aplasia; thymic cysts; thymic hyperplasia, which involves the appearance of lymphoid follicles within the thymus, creating thymic follicular hyperplasia; and thymomas, including germ cell tumors, lymphomas, Hodgkin disease, and carcinoids. Thymomas can include benign or encapsulated thymoma, and malignant thymoma Type I (invasive thymoma) or Type II, designated thymic carcinoma.

Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

Bone-forming cells include the osteoprogenitor cells, osteoblasts, and osteocytes. The disorders of the bone are complex because they may have an impact on the skeleton during any of its stages of development. Hence, the disorders may have variable manifestations and may involve one, multiple or all bones of the body. Such disorders include congenital malformations, achondroplasia and thanatophoric dwarfism, diseases associated with abnormal matrix such as type 1 collagen disease, osteoporosis, Paget disease, rickets, osteomalacia, high-turnover osteodystrophy, low-turnover or aplastic disease, osteonecrosis, pyogenic osteomyelitis, tuberculous osteomyelitis, osteoma, osteoid osteoma, osteoblastoma, osteosarcoma, osteochondroma, chondromas, chondroblastoma, chondromyxoid fibroma, chondrosarcoma, fibrous cortical defects, fibrous dysplasia, fibrosarcoma, malignant fibrous histiocytoma, Ewing sarcoma, primitive neuroectodermal tumor, giant cell tumor, and metastatic tumors.

In addition, lipases influence a number of processes which affect the biology of both blood vessel walls and the pancreas. Therefore, lipases find use in the treatment of disorders of blood vessels, which include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

Disorders involving the pancreas include those of the exocrine pancreas such as congenital anomalies, including but not limited to, ectopic pancreas; pancreatitis, including but not limited to, acute pancreatitis; cysts, including but not limited to, pseudocysts; tumors, including but not limited to, cystic tumors and carcinoma of the pancreas; and disorders of the endocrine pancreas such as diabetes mellitus; islet cell tumors, including but not limited to, insulinomas, gastrinomas, and other rare islet cell tumors.

Lipases play critical roles in lipid metabolism and are associated with various lipid-related pathologies in humans such as, but not limited to, Wolman's disease, hypertension, Type II diabetes, retinopathy and cholesterol ester storage disease. Furthermore, a decrease in LPL activity impairs the catabolism of chylomicrons and VLDL, resulting in massive hypertriglyceridemia. Decreased LPL activity has also been associated with many disorders, including, for example, chylomicronemia syndrome. This syndrome has multiple clinical symptoms and manifestations, reviewed by Murthy et al. (1996) Pharmacol. Ther. 70:101-135. Additional disorders resulting from defective LPL activity include familial lipoprotein lipase deficiency with fasting chylomicronemia (type I hyperlipidemia) (Santamarina et al. (1992) Curr. Opin. Lipidology 3:186), LPL deficiency, familial combined hyperlipidaemia (FCHL) (Babirak et al. (1992) Arterioscler. Thromb. 12:1176; Seed et al. (1994) Clin. Invest. 72:100), hypertriglyceridemia, pancreatitis and abnormalities in postprandial lipemia. In addition, LPL activity is abnormally regulated in obesity (Kern et al. (1997) J. Nutr. 127:1917S-1922S) and is also affected by alcohol and several hormones (Taskinen et al. (1987) Lipoprotein Lipase, Borensztajn J. (ed) Evener, Chicago). Furthermore, changes in circulating lipoprotein and creation of lipolytic products have been implicated in a number of processes that affect the biology of vessel walls. For example, atherogenesis is associated with increased LPL activity. In addition, autoantibodies against LPL have been reported in patients with idiopathic thrombocytopenic purpura and Graves' disease (Kihara et al. (1989) N. Engl. J. Med. 320:1255-1259) and heparin resistance was noted in a case of disseminated lupus erythematosus (Glueck et al. (1969) Am. J. Med. 47:318-324). Polymorphisms in the LPL gene have also been associated with altered levels of total and HDL cholesterol (Mitchell et al. (1994) Hum. Biol. 66:383-397), coronary heart disease (Mattu et al. (1994) Arterioscler. Thromb. 14:1090-1097), and insulin resistance (Cole et al. (1993) Genet. Epidemiol. 10:177-188).

The hydrolysis of HDL by hepatic lipase regulates cholesterol levels in hepatic tissue. Pathologies associated with cholesterol include, but are not limited to, atherosclerosis, xanthomas, inflammation and necrosis, cholesterolosis and gall stone formation.

The lipase polypeptides are thus useful for treating a lipase-associated disorder characterized by aberrant expression or activity of a lipase. The polypeptides can also be useful for treating a disorder characterized by excessive amounts of lipoproteins, triglycerides or cholesterol. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) expression or activity of the protein. In another embodiment, the method involves administering the lipase as therapy to compensate for reduced or aberrant expression or activity of the protein. In another embodiment, the lipase polypeptides are useful for treating breast, lung, colon, and liver cancers.

Methods for treatment include, but are not limited to, the use of soluble lipase or fragments of the lipase protein that compete for substrates including those disclosed herein. These lipases or fragments can have a higher affinity for the target so as to provide effective competition.

Stimulation of activity is desirable in situations in which the protein is abnormally downregulated and/or in which increased activity is likely to have a beneficial effect. Likewise, inhibition of activity is desirable in situations in which the protein is abnormally upregulated and/or in which decreased activity is likely to have a beneficial effect. In one example of such a situation, a subject has a disorder characterized by aberrant metabolism of lipids resulting in altered lipoprotein concentrations, energy homeostasis, atherosclerosis, and body weight parameters.

In yet another aspect of the invention, the proteins of the invention can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO 94/10300), to identify other proteins (captured proteins) which bind to or interact with the proteins of the invention and modulate their activity.

The lipase polypeptides also are useful to provide a target for diagnosing a disease or predisposition to disease mediated by the lipase, including, but not limited to, diseases involving tissues in which the lipase is expressed as disclosed herein. Accordingly, methods are provided for detecting the presence, or levels of, the lipase in a cell, tissue, or organism. The method involves contacting a biological sample with a compound capable of interacting with the lipase such that the interaction can be detected.

The polypeptides are also useful for treating a disorder characterized by reduced amounts of these components. Thus, increasing or decreasing the activity of the lipase is beneficial to treatment. The polypeptides are also useful to provide a target for diagnosing a disease characterized by excessive substrate or reduced levels of substrate. Accordingly, where substrate is excessive, use of the lipase polypeptides can provide a diagnostic assay. Furthermore, for example, lipases having reduced activity can be used to diagnose conditions in which reduced substrate is responsible for the disorder.

One agent for detecting lipase is an antibody capable of selectively binding to the lipase polypeptide. A biological sample includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject.

The lipase also provides a target for diagnosing active disease, or predisposition to disease, in a patient having a variant lipase. Thus, lipase can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in an aberrant protein. This includes amino acid substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest, altered lipase activity in cell-based or cell-free assay, alteration in binding to or hydrolysis of lipids, binding to activator proteins, cell surface receptors, apoproteins, lipoproteins, proteoglycans, heparin, or antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a protein in general or in a lipase specifically, including assays discussed herein.

In vitro techniques for detection of lipase include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. Alternatively, the protein can be detected in vivo in a subject by introducing into the subject a labeled anti-lipase antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Particularly useful are methods that detect the allelic variant of the lipase expressed in a subject, and methods that detect fragments of the lipase in a sample.

The lipase polypeptides are also useful in pharmacogenomic analysis. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985, and Linder, M. W. (1997) Clin. Chem. 43(2):254-266. The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants of the lipase in which one or more of the lipase functions in one population is different from those in another population. The polypeptides thus allow a target to ascertain a genetic predisposition that can affect treatment modality. Thus, in a lipase-based treatment, polymorphism may give rise to catalytic regions that are more or less active. Accordingly, dosage would necessarily be modified to maximize the therapeutic effect within a given population containing the polymorphism. As an alternative to genotyping, specific polymorphic polypeptides could be identified.

The lipase polypeptides are also useful for monitoring therapeutic effects during clinical trials and other treatment. Thus, the therapeutic effectiveness of an agent that is designed to increase or decrease gene expression, protein levels or lipase activity can be monitored over the course of treatment using the lipase polypeptides as an end-point target. The monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression or activity of the protein in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein in the post-administration samples; (v) comparing the level of expression or activity of the protein in the pre-administration sample with the level in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.
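Steps (v) and (vi) of the monitoring scheme amount to a comparison against baseline. The following Python sketch is illustrative only: the function name, the 10% tolerance, and the rule that maps a measured change to a dosing action are assumptions introduced here and are not part of the described method.

```python
from statistics import mean


def compare_levels(pre_level, post_levels, goal="decrease", tolerance=0.10):
    """Compare pre- and post-administration lipase levels (steps v and vi).

    goal      -- intended effect of the agent: "increase" or "decrease"
    tolerance -- hypothetical fractional change below which the level is
                 treated as unchanged
    """
    if goal not in ("increase", "decrease"):
        raise ValueError("goal must be 'increase' or 'decrease'")
    if pre_level <= 0 or not post_levels:
        raise ValueError("need a positive baseline and at least one post sample")

    # Step (v): fractional change of the mean post level versus baseline.
    change = (mean(post_levels) - pre_level) / pre_level

    # Step (vi): illustrative rule only; real dosing decisions are clinical.
    if goal == "decrease":
        responded = change <= -tolerance
    else:
        responded = change >= tolerance
    action = ("maintain or decrease administration" if responded
              else "increase administration")
    return change, action
```

For instance, `compare_levels(100.0, [80.0, 60.0])` compares a baseline of 100 units with two later measurements and reports the fractional change together with the suggested action.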

Antibodies

The invention also provides antibodies that selectively bind to the lipase and its variants and fragments. An antibody is considered to selectively bind, even if it also binds to other proteins that are not substantially homologous with the lipase. These other proteins share homology with a fragment or domain of the lipase polypeptide. This conservation in specific regions gives rise to antibodies that bind to both proteins by virtue of the homologous sequence. In this case, it would be understood that antibody binding to the lipase is still selective.

An isolated lipase polypeptide is used as an immunogen to generate antibodies using standard techniques for polyclonal and monoclonal antibody preparation. Either the full-length protein or antigenic peptide fragment can be used. Regions having a high antigenicity index are shown in FIG. 40.

Antibodies are preferably prepared from these regions or from discrete fragments in these regions. However, antibodies can be prepared from any region of the peptide as described herein. A preferred fragment produces an antibody that diminishes or completely prevents substrate hydrolysis or binding. Antibodies can be developed against the entire lipase protein or domains of the lipase as described herein. Antibodies can also be developed against specific functional sites as disclosed herein.

The antigenic peptide can comprise a contiguous sequence of at least 8, 13, 14, 15, or 30 amino acid residues. In one embodiment, fragments correspond to regions that are located on the surface of the protein, e.g., hydrophilic regions. These fragments are not to be construed, however, as encompassing any fragments that may be disclosed prior to the invention.

Antibodies can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

An appropriate immunogenic preparation can be derived from native, recombinantly expressed, or chemically synthesized peptides.

Antibody Uses

The antibodies can be used to isolate a lipase by standard techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate the purification of the natural lipase from cells and recombinantly produced lipase expressed in host cells.

The antibodies are useful to detect the presence of lipase in cells or tissues to determine the pattern of expression of the lipase among various tissues in an organism and over the course of normal development.

The antibodies can be used to detect lipase in situ, in vitro, or in a cell lysate or supernatant in order to evaluate the abundance and pattern of expression.

The antibodies can be used to assess abnormal tissue distribution or abnormal expression during development.

Antibody detection of circulating fragments of the full length lipase can be used to identify lipase turnover.

Further, the antibodies can be used to assess lipase expression in disease states such as in active stages of the disease or in an individual with a predisposition toward disease related to lipid metabolism. When a disorder is caused by an inappropriate tissue distribution, developmental expression, or level of expression of the lipase protein, the antibody can be prepared against the normal lipase protein. If a disorder is characterized by a specific mutation in the lipase, antibodies specific for this mutant protein can be used to assay for the presence of the specific mutant lipase polypeptides. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular lipase-peptide regions.

The antibodies can also be used to assess normal and aberrant subcellular localization of the lipase in cells of the various tissues in an organism. Antibodies can be developed against the whole lipase or portions of the lipase.

The diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment is ultimately aimed at correcting lipase expression level or the presence of aberrant lipase proteins and aberrant tissue distribution or developmental expression, antibodies directed against the lipase or relevant fragments can be used to monitor therapeutic efficacy.

Antibodies accordingly can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen.

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared against polymorphic lipases can be used to identify individuals that require modified treatment modalities.

The antibodies are also useful as diagnostic tools as an immunological marker for aberrant lipase analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and other physical assays known to those in the art.

The antibodies are also useful for tissue typing. Thus, where a specific lipase has been correlated with expression in a specific tissue, antibodies that are specific for this lipase can be used to identify a tissue type.

The antibodies are also useful in forensic identification. Accordingly, where an individual has been correlated with a specific genetic polymorphism resulting in a specific polymorphic protein, an antibody specific for the polymorphic protein can be used as an aid in identification.

The antibodies are also useful for inhibiting the various lipase functions as described herein.

These uses can also be applied in a therapeutic context in which treatment involves inhibiting lipase function. Antibodies can be prepared against specific fragments containing sites required for function or against intact lipase associated with a cell.

Completely human antibodies are particularly desirable for therapeutic treatment of human patients. For an overview of this technology for producing human antibodies, see Lonberg et al. (1995) Int. Rev. Immunol. 13:65-93. For a detailed discussion of this technology for producing human antibodies and human monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Pat. No. 5,625,126; U.S. Pat. No. 5,633,425; U.S. Pat. No. 5,569,825; U.S. Pat. No. 5,661,016; and U.S. Pat. No. 5,545,806.

The invention also encompasses kits for using antibodies to detect the presence of a lipase protein in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and a compound or agent for detecting lipase in a biological sample; means for determining the amount of lipase in the sample; and means for comparing the amount of lipase in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect lipase.

Polynucleotides

The nucleotide sequence in SEQ ID NO:18 was obtained by sequencing the deposited human cDNA. Accordingly, the sequence of the deposited clone is controlling as to any discrepancies between the two and any reference to the sequence of SEQ ID NO:18 includes reference to the sequence of the deposited cDNA.

The specifically disclosed cDNA comprises the coding region and 5′ and 3′ untranslated sequences in SEQ ID NO:18.

The invention provides isolated polynucleotides encoding the novel lipase. The term “lipase polynucleotide” or “lipase nucleic acid” refers to the sequence shown in SEQ ID NO:18 or in the deposited cDNA. The term “lipase polynucleotide” or “lipase nucleic acid” further includes variants and fragments of the lipase polynucleotide.

An “isolated” lipase nucleic acid is one that is separated from other nucleic acid present in the natural source of the lipase nucleic acid. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the lipase nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for example up to about 5 kb. The important point is that the lipase nucleic acid is isolated from flanking sequences such that it can be subjected to the specific manipulations described herein, such as recombinant expression, preparation of probes and primers, and other uses specific to the lipase nucleic acid sequences.

Moreover, an “isolated” nucleic acid molecule, such as a cDNA or RNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.

In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present.

For example, recombinant DNA molecules contained in a vector are considered isolated. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

The lipase polynucleotides can encode the mature protein plus additional amino or carboxyterminal amino acids, or amino acids interior to the mature polypeptide (when the mature form has more than one polypeptide chain, for instance). Such sequences may play a role in processing of a protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein half-life or facilitate manipulation of a protein for assay or production, among other things. As generally is the case in situ, the additional amino acids may be processed away from the mature protein by cellular enzymes.

The lipase polynucleotides include, but are not limited to, the sequence encoding the mature polypeptide alone; the sequence encoding the mature polypeptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or pro-protein sequence); and the sequence encoding the mature polypeptide, with or without the additional coding sequences, plus additional non-coding sequences, for example introns and non-coding 5′ and 3′ sequences such as transcribed but non-translated sequences that play a role in transcription, mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability of mRNA. In addition, the polynucleotide may be fused to a marker sequence encoding, for example, a peptide that facilitates purification.

Lipase polynucleotides can be in the form of RNA, such as mRNA, or in the form of DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the non-coding strand (anti-sense strand).

Lipase nucleic acid can comprise the nucleotide sequence shown in SEQ ID NO:18, corresponding to human cDNA.

In one embodiment, the lipase nucleic acid comprises only the coding region.

The invention further provides variant lipase polynucleotides, and fragments thereof, that differ from the nucleotide sequence shown in SEQ ID NO:18 due to degeneracy of the genetic code and thus encode the same protein as that encoded by the nucleotide sequence shown in SEQ ID NO:18.
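Degeneracy of the genetic code can be illustrated with a short Python sketch. The two 12-nucleotide sequences below are hypothetical and are not taken from SEQ ID NO:18; the function name `translate` is likewise chosen here for illustration. The sketch uses the standard genetic code and shows two different nucleotide sequences that encode the same peptide.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence codon by codon (frame 1)."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Hypothetical sequences that differ at six of twelve positions.
seq_a = "ATGCTGTCTAAA"
seq_b = "ATGTTAAGCAAG"
same_protein = translate(seq_a) == translate(seq_b)
```

Here `same_protein` is true even though the nucleotide sequences differ, which is the sense in which a degenerate variant encodes the same protein.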

The invention also provides lipase nucleic acid molecules encoding the variant polypeptides described herein. Such polynucleotides may be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or may be constructed by recombinant DNA methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. Accordingly, as discussed above, the variants can contain nucleotide substitutions, deletions, inversions and insertions.

Typically, variants have a substantial identity with a nucleic acid molecule of SEQ ID NO:18 and the complements thereof. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions.

Orthologs, homologs, and allelic variants can be identified using methods well known in the art. These variants comprise a nucleotide sequence encoding a lipase that is at least about 60-65%, 65-70%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more homologous to the nucleotide sequence shown in SEQ ID NO:18. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions, to the nucleotide sequence shown in SEQ ID NO:18 or a fragment of the sequence. It is understood that stringent hybridization does not indicate substantial homology where it is due to general homology, such as poly A sequences, or sequences common to all or most proteins or all lipase enzymes. Moreover, it is understood that variants do not include any of the nucleic acid sequences that may have been disclosed prior to the invention.
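The percentage thresholds above can be checked computationally once two sequences have been aligned. The following Python sketch assumes the alignment has already been produced by some other tool and that gaps are written as "-"; the function name and the choice of denominator (all alignment columns that are not gaps in both sequences) are assumptions made here, and other conventions for percent identity exist.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns that are gaps ("-") in both sequences are ignored; a column
    with a gap in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


# Hypothetical aligned sequences, not taken from SEQ ID NO:18.
identity = percent_identity("ATGCTGTCTAAA", "ATGTTAAGCAAG")
is_variant_candidate = identity >= 60.0  # lowest threshold named in the text
```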

As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences encoding a polypeptide at least about 60-65% homologous to each other typically remain hybridized to each other. The conditions can be such that sequences at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 90%, at least about 95% or more identical to each other remain hybridized to one another. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, incorporated by reference. One example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. In another non-limiting example, nucleic acid molecules are allowed to hybridize in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more low stringency washes in 0.2×SSC/0.1% SDS at room temperature, or by one or more moderate stringency washes in 0.2×SSC/0.1% SDS at 42° C., or washed in 0.2×SSC/0.1% SDS at 65° C. for high stringency. In one embodiment, an isolated nucleic acid molecule that hybridizes under stringent conditions to the sequence of SEQ ID NO:18 corresponds to a naturally-occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

As understood by those of ordinary skill, the exact conditions can be determined empirically and depend on ionic strength, temperature and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS. Other factors considered in determining the desired hybridization conditions include the length of the nucleic acid sequences, base composition, percent mismatch between the hybridizing sequences and the frequency of occurrence of subsets of the sequences within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules.
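The dependence of hybrid stability on ionic strength, formamide, length, base composition and mismatch can be sketched with a widely used empirical approximation for long DNA-DNA hybrids, Tm = 81.5 + 16.6 log10(M) + 0.41(%GC) - 0.61(%formamide) - 500/L, with roughly 1° C. subtracted per 1% mismatch. This formula is not given in the present text and is supplied here only as an assumption for illustration; the function and parameter names are likewise hypothetical, and actual conditions are determined empirically as stated above.

```python
import math


def estimate_tm(sequence, monovalent_molar, formamide_percent=0.0,
                mismatch_percent=0.0):
    """Rough melting temperature (degrees C) of a long DNA-DNA hybrid.

    sequence          -- one strand of the hybrid (A, C, G, T)
    monovalent_molar  -- molar concentration of monovalent cations
    formamide_percent -- percent formamide in the solution
    mismatch_percent  -- percent mismatched positions in the hybrid
    """
    seq = sequence.upper()
    if not seq or monovalent_molar <= 0:
        raise ValueError("need a non-empty sequence and a positive salt level")
    length = len(seq)
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / length
    tm = (81.5
          + 16.6 * math.log10(monovalent_molar)
          + 0.41 * gc_percent
          - 0.61 * formamide_percent
          - 500.0 / length)
    # Approximate rule of thumb: about 1 degree lost per 1% mismatch.
    return tm - mismatch_percent
```

Lowering the salt concentration, adding formamide, or tolerating more mismatch each lowers the estimate, which is why a wash at fixed temperature becomes more stringent as salt is reduced.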

The present invention also provides isolated nucleic acids that contain a single or double stranded fragment or portion that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO:18 or the complement of SEQ ID NO:18. In one embodiment, the nucleic acid consists of a portion of the nucleotide sequence of SEQ ID NO:18 or the complement of SEQ ID NO:18.

It is understood that isolated fragments include any contiguous sequence not disclosed prior to the invention as well as sequences that are substantially the same and which are not disclosed. Accordingly, if a fragment is disclosed prior to the present invention, that fragment is not intended to be encompassed by the invention. When a sequence is not disclosed prior to the present invention, an isolated nucleic acid fragment is at least about 6, preferably at least about 10, 13, 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100, 200, 500 or more nucleotides in length. Nucleotide sequences from about 1517 to 1964 are not disclosed prior to the invention. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic proteins or polypeptides described herein are useful.

Furthermore, the invention provides polynucleotides that comprise a fragment of the full-length lipase polynucleotides. The fragment can be single or double-stranded and can comprise DNA or RNA. The fragment can be derived from either the coding or the non-coding sequence.

In another embodiment an isolated lipase nucleic acid encodes the entire coding region. Other fragments include nucleotide sequences encoding the amino acid fragments described herein.

Thus, lipase nucleic acid fragments further include sequences corresponding to the domains described herein, subregions also described, and specific functional sites. Lipase nucleic acid fragments also include combinations of the domains, segments, and other functional sites described above. A person of ordinary skill in the art would be aware of the many permutations that are possible.

Where the location of the domains or sites has been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains can vary depending on the criteria used to define the domains.

However, it is understood that a lipase fragment includes any nucleic acid sequence that does not include the entire gene.

The invention also provides lipase nucleic acid fragments that encode epitope bearing regions of the lipase proteins described herein.

Nucleic acid fragments, according to the present invention, are not to be construed as encompassing those fragments that may have been disclosed prior to the invention.

Polynucleotide Uses

The nucleotide sequences of the present invention can be used as a “query sequence” to perform a search against public databases, for example, to identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

The nucleic acid fragments of the invention provide probes or primers in assays such as those described below. “Probes” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al. (1991) Science 254:1497-1500. Typically, a probe comprises a region of nucleotide sequence that hybridizes under highly stringent conditions to at least about 15, typically about 20-25, and more typically about 40, 50 or 75 consecutive nucleotides of the nucleic acid sequence shown in SEQ ID NO:18 and the complements thereof. More typically, the probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis using well-known methods (e.g., PCR, LCR) including, but not limited to those described herein. The appropriate length of the primer depends on the particular use, but typically ranges from about 15 to 30 nucleotides. The term “primer site” refers to the area of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the nucleic acid sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the sequence to be amplified.
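The notion of a primer pair can be sketched in Python as follows. The sketch follows one common laboratory convention, assumed here rather than stated in the text: the upstream primer is written as the first bases of the sense strand of the target, and the downstream primer as the reverse complement of its last bases, both 5′ to 3′. The function names and the example target are hypothetical, and the sketch ignores practical design criteria such as melting temperature and self-complementarity.

```python
# Watson-Crick complement for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def primer_pair(target, length=20):
    """Pick an upstream and a downstream primer flanking a target region.

    length -- primer length; restricted to the 15-30 nucleotide range
              given in the text as typical.
    """
    if not 15 <= length <= 30:
        raise ValueError("primer length should be between 15 and 30")
    target = target.upper()
    if len(target) < 2 * length:
        raise ValueError("target too short for two non-overlapping primers")
    upstream = target[:length]                         # matches the 5' end
    downstream = reverse_complement(target[-length:])  # anneals at the 3' end
    return upstream, downstream


# Hypothetical 40-nucleotide target, not taken from SEQ ID NO:18.
fwd, rev = primer_pair("A" * 15 + "C" * 10 + "G" * 15, length=15)
```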

The lipase polynucleotides are thus useful for probes, primers, and in biological assays.

Where the polynucleotides are used to assess lipase properties or functions, such as in the assays described herein, all or less than all of the entire cDNA can be useful. Assays specifically directed to lipase functions, such as assessing agonist or antagonist activity, encompass the use of known fragments. Further, diagnostic methods for assessing lipase function can also be practiced with any fragment, including those fragments that may have been known prior to the invention. Similarly, in methods involving treatment of lipase dysfunction, all fragments are encompassed including those that may have been known in the art.

The lipase polynucleotides are useful as a hybridization probe for cDNA and genomic DNA to isolate a full-length cDNA and genomic clones encoding the polypeptide described in SEQ ID NO:17 and to isolate cDNA and genomic clones that correspond to variants producing the same polypeptide shown in SEQ ID NO:17 or the other variants described herein. Variants can be isolated from the same tissue and organism from which the polypeptides shown in SEQ ID NO:17 were isolated, different tissues from the same organism, or from different organisms. This method is useful for isolating genes and cDNA that are developmentally-controlled and therefore may be expressed in the same tissue or different tissues at different points in the development of an organism.

The probe can correspond to any sequence along the entire length of the gene encoding the lipase. Accordingly, it could be derived from 5′ noncoding regions, the coding region, and 3′ noncoding regions.

The nucleic acid probe can be, for example, the full-length cDNA of SEQ ID NO:18 or a fragment thereof that is sufficient to specifically hybridize under stringent conditions to mRNA or DNA.

Fragments of the polynucleotides described herein are also useful to synthesize larger fragments or full-length polynucleotides described herein. For example, a fragment can be hybridized to any portion of an mRNA and a larger or full-length cDNA can be produced.

The fragments are also useful to synthesize antisense molecules of desired length and sequence.

Antisense nucleic acids of the invention can be designed using the nucleotide sequence of SEQ ID NO:18, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest).
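Designing an unmodified antisense oligonucleotide from a known sense sequence reduces to taking the reverse complement of the targeted region. The Python sketch below assumes a DNA oligonucleotide directed against an mRNA region; the function name and the six-nucleotide example are hypothetical, and the sketch does not model the modified nucleotides listed above.

```python
# Maps each RNA base of the target to the complementary DNA base.
RNA_TO_DNA_COMPLEMENT = str.maketrans("ACGU", "TGCA")


def antisense_oligo(mrna_region):
    """Return a DNA oligonucleotide (5' to 3') antisense to an mRNA region.

    The input may be written with U or T; it is normalized to RNA first.
    """
    region = mrna_region.upper().replace("T", "U")
    unexpected = set(region) - set("ACGU")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return region.translate(RNA_TO_DNA_COMPLEMENT)[::-1]


# Hypothetical target region, not taken from SEQ ID NO:18.
oligo = antisense_oligo("AUGGCU")
```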

Additionally, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670. PNAs can be further modified, e.g., to enhance their stability, specificity or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra, Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63, Mag et al. (1989) Nucleic Acids Res. 17:5973, and Petersen et al. (1995) Bioorganic Med. Chem. Lett. 5:1119.

The nucleic acid molecules and fragments of the invention can also include other appended groups such as peptides (e.g., for targeting host cell lipases in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/0918) or the blood brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm Res. 5:539-549).

The lipase polynucleotides are also useful as primers for PCR to amplifyany given region of a lipase polynucleotide.

The lipase polynucleotides are also useful for constructing recombinantvectors. Such vectors include expression vectors that express a portionof, or all of, the lipase polypeptides. Vectors also include insertionvectors, used to integrate into another polynucleotide sequence, such asinto the cellular genome, to alter in situ expression of lipase genesand gene products. For example, an endogenous lipase coding sequence canbe replaced via homologous recombination with all or part of the codingregion containing one or more specifically introduced mutations.

The lipase polynucleotides are also useful for expressing antigenicportions of the lipase proteins.

The lipase polynucleotides are also useful as probes for determining thechromosomal positions of the lipase polynucleotides by means of in situhybridization methods, such as FISH. (For a review of this technique,see Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques(Pergamon Press, New York), and PCR mapping of somatic cell hybrids. Themapping of the sequences to chromosomes is an important first step incorrelating these sequences with genes associated with disease.

Reagents for chromosome mapping can be used individually to mark asingle chromosome or a single site on that chromosome, or panels ofreagents can be used for marking multiple sites and/or multiplechromosomes. Reagents corresponding to noncoding regions of the genesactually are preferred for mapping purposes. Coding sequences are morelikely to be conserved within gene families, thus increasing the chanceof cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, thephysical position of the sequence on the chromosome can be correlatedwith genetic map data. (Such data are found, for example, in V.McKusick, Mendelian Inheritance in Man, available on-line through JohnsHopkins University Welch Medical Library). The relationship between agene and a disease mapped to the same chromosomal region, can then beidentified through linkage analysis (co-inheritance of physicallyadjacent genes), described in, for example, Egeland et al. ((1987)Nature 325:783-787).

Moreover, differences in the DNA sequences between individuals affectedand unaffected with a disease associated with a specified gene, can bedetermined. If a mutation is observed in some or all of the affectedindividuals but not in any unaffected individuals, then the mutation islikely to be the causative agent of the particular disease. Comparisonof affected and unaffected individuals generally involves first lookingfor structural alterations in the chromosomes, such as deletions ortranslocations, that are visible from chromosome spreads, or detectableusing PCR based on that DNA sequence. Ultimately, complete sequencing ofgenes from several individuals can be performed to confirm the presenceof a mutation and to distinguish mutations from polymorphisms.

The lipase polynucleotide probes are also useful to determine patterns of the presence of the gene encoding the lipase and their variants with respect to tissue distribution, for example, whether gene duplication has occurred and whether the duplication occurs in all or only a subset of tissues. The genes can be naturally occurring or can have been introduced into a cell, tissue, or organism exogenously.

The lipase polynucleotides are also useful for designing ribozymes corresponding to all, or a part, of the mRNA produced from genes encoding the polynucleotides described herein.

The lipase polynucleotides are also useful for constructing host cells expressing a part, or all, of the lipase polynucleotides and polypeptides.

The lipase polynucleotides are also useful for constructing transgenic animals expressing all, or a part, of the lipase polynucleotides and polypeptides.

The lipase polynucleotides are also useful for making vectors that express part, or all, of the lipase polypeptides.

The lipase polynucleotides are also useful as hybridization probes for determining the level of lipase nucleic acid expression. Accordingly, the probes can be used to detect the presence of, or to determine levels of, lipase nucleic acid in cells, tissues, and in organisms. The nucleic acid whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the polypeptides described herein can be used to assess gene copy number in a given cell, tissue, or organism. This is particularly relevant in cases in which there has been an amplification of the lipase genes.

Alternatively, the probe can be used in an in situ hybridization context to assess the position of extra copies of the lipase genes, as on extrachromosomal elements or as integrated into chromosomes in which the lipase gene is not normally found, for example as a homogeneously staining region.

These uses are relevant for diagnosis of disorders involving an increase or decrease in lipase expression relative to normal, such as a developmental or a metabolic disorder.

Tissues and/or cells in which the lipase is expressed include, but are not limited to, those shown in FIGS. 43, 44, and 45. Such tissues/cells include liver, fetal liver, breast, brain, fetal kidney, and testis. Moderate expression occurs in prostate, skeletal muscle, colon, kidney, and thyroid. Lower positive expression occurs in heart, fetal heart, small intestine, spleen, lung, ovary, vein, aorta, placenta, osteoblasts, cervix, esophagus, thymus, tonsil, and lymph node. The lipase is also expressed in malignant breast, lung, and colon tissue and in liver metastases derived from malignant colonic tissues. Hence, the lipase is relevant to disorders involving the tissues in which it is expressed. As such, the gene is particularly relevant for the treatment of disorders involving breast, lung, liver, and colon cancer. Disorders in which the lipase expression is relevant include, but are not limited to, those disclosed herein above.

Thus, the present invention provides a method for identifying a disease or disorder associated with aberrant expression or activity of lipase nucleic acid, in which a test sample is obtained from a subject and nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of the nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity of the nucleic acid.

One aspect of the invention relates to diagnostic assays for determining nucleic acid expression as well as activity in the context of a biological sample (e.g., blood, serum, cells, tissue) to determine whether an individual has a disease or disorder, or is at risk of developing a disease or disorder, associated with aberrant nucleic acid expression or activity. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with expression or activity of the nucleic acid molecules.

In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express the lipase, such as by measuring the level of a lipase-encoding nucleic acid in a sample of cells from a subject, e.g., mRNA or genomic DNA, or determining if the lipase gene has been mutated.

Nucleic acid expression assays are useful for drug screening to identify compounds that modulate lipase nucleic acid expression (e.g., antisense, polypeptides, peptidomimetics, small molecules or other drugs). A cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of the mRNA in the presence of the candidate compound is compared to the level of expression of the mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. The modulator can bind to the nucleic acid or indirectly modulate expression, such as by interacting with other cellular components that affect nucleic acid expression.

Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject) in patients or in transgenic animals.

The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with nucleic acid expression of the lipase gene. The method typically includes assaying the ability of the compound to modulate the expression of the lipase nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by undesired lipase nucleic acid expression.

The assays can be performed in cell-based and cell-free systems. Cell-based assays include cells naturally expressing the lipase nucleic acid or recombinant cells genetically engineered to express specific nucleic acid sequences.

Alternatively, candidate compounds can be assayed in vivo in patients or in transgenic animals.

The assay for lipase nucleic acid expression can involve direct assay of nucleic acid levels, such as mRNA levels, or assay of collateral compounds involved in the pathway. Further, the expression of genes that are up- or down-regulated in response to the lipase activity can also be assayed. In this embodiment the regulatory regions of these genes can be operably linked to a reporter gene such as luciferase.

Thus, modulators of lipase gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of lipase mRNA in the presence of the candidate compound is compared to the level of expression of lipase mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
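By way of illustration only, the comparison described above can be sketched in Python. The function names, the 0.05 significance threshold, and the choice of an exact permutation test on replicate mRNA measurements are assumptions made for this example; no particular statistical test is specified herein.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(treated) + list(untreated)
    indices = range(len(pooled))
    observed = abs(mean(treated) - mean(untreated))
    extreme = 0
    total = 0
    # Enumerate every way of relabeling the pooled measurements.
    for chosen in combinations(indices, len(treated)):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen_set]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from replicate mRNA expression levels.

    alpha is a hypothetical significance threshold chosen for this sketch.
    """
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    if mean(treated) > mean(untreated):
        return "stimulator"
    return "inhibitor"
```

With four replicates per condition, a complete separation of the two groups gives a p-value of 2/70, so `classify_compound([8.1, 7.9, 8.4, 8.0], [4.0, 4.2, 3.9, 4.1])` labels the compound a stimulator under these assumptions.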

Accordingly, the invention provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate lipase nucleic acid expression. Modulation includes up-regulation (i.e., activation or agonization), down-regulation (i.e., suppression or antagonization), and effects on nucleic acid activity (e.g., when nucleic acid is mutated or improperly modified). Disorders that can be treated include those characterized by aberrant expression or activity of the nucleic acid. In addition, disorders that are influenced by the lipase may also be treated. Examples of such disorders are disclosed herein.

Alternatively, a modulator for lipase nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the lipase nucleic acid expression.

The lipase polynucleotides are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the lipase gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

Monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of a specified mRNA or genomic DNA of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the mRNA or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the mRNA or genomic DNA in the pre-administration sample with the mRNA or genomic DNA in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.

The lipase polynucleotides are also useful in diagnostic assays for qualitative changes in lipase nucleic acid, and particularly in qualitative changes that lead to pathology. The polynucleotides can be used to detect mutations in lipase genes and gene expression products such as mRNA. The polynucleotides can be used as hybridization probes to detect naturally-occurring genetic mutations in the lipase gene and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene; chromosomal rearrangement, such as inversion or transposition; modification of genomic DNA, such as aberrant methylation patterns; or changes in gene copy number, such as amplification. Detection of a mutated form of the lipase gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of a lipase.

Mutations in the lipase gene can be detected at the nucleic acid level by a variety of techniques. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way.

In certain embodiments, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) PNAS 91:360-364), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.
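The size comparison described above can be illustrated with a minimal in-silico sketch in Python. It assumes exact primer matches on a single template strand and ignores mismatches, primer dimers, and reaction conditions; the function names and example sequences are hypothetical and do not correspond to any sequence of the invention.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the product expected from exact primer matches, or None."""
    start = template.find(forward_primer)
    if start < 0:
        return None
    # The reverse primer anneals to the opposite strand, so on the given
    # strand its binding site reads as the primer's reverse complement.
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end < 0:
        return None
    return end + len(site) - start


def compare_to_control(sample_len, control_len):
    """Interpret a sample amplicon size relative to the control size."""
    if sample_len is None:
        return "no amplification product"
    if sample_len == control_len:
        return "same size as control"
    return "possible insertion" if sample_len > control_len else "possible deletion"
```

In practice, product sizes are read from a gel or capillary trace with limited resolution, so a tolerance on the size comparison would be appropriate; the exact comparison here is a simplification.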

It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well-known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.

Alternatively, mutations in a lipase gene can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
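As a simplified illustration of how a mutation can alter a digestion pattern, the following Python sketch computes the fragment sizes expected from cutting a linear sequence at every occurrence of a recognition site. EcoRI (recognition site GAATTC, cut after the first G) is used only as a familiar example; the function name and the test sequences are hypothetical.

```python
def fragment_sizes(seq, site, cut_offset):
    """Sorted fragment lengths after cutting a linear sequence at each site.

    cut_offset is the position of the cut within the recognition site,
    e.g. 1 for EcoRI (G^AATTC).
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted(end - start for start, end in zip(bounds, bounds[1:]))


wild_type = "AAAAGAATTCCCCCCC"
mutant = "AAAAGAGTTCCCCCCC"  # single substitution destroys the site

print(fragment_sizes(wild_type, "GAATTC", 1))  # two fragments
print(fragment_sizes(mutant, "GAATTC", 1))     # one uncut fragment
```

A substitution within the recognition site leaves the mutant sequence uncut, so the two-band pattern of the wild type collapses to a single band.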

Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.

Perfectly matched sequences can be distinguished from mismatched sequences by nuclease cleavage digestion assays or by differences in melting temperature.

Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method.

Furthermore, sequence differences between a mutant lipase gene and a wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-159).

Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) PNAS 85:4397; Saleeba et al. (1992) Meth. Enzymol. 217:286-295), electrophoretic mobility of mutant and wild type nucleic acid is compared (Orita et al. (1989) PNAS 86:2766; Cotton et al. (1993) Mutat. Res. 285:125-144; and Hayashi et al. (1992) Genet. Anal. Tech. Appl. 9:73-79), and movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al. (1985) Nature 313:495). The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In one embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet. 7:5). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

In other embodiments, genetic mutations can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al. (1996) Human Mutation 7:244-255; Kozal et al. (1996) Nature Medicine 2:753-759). For example, genetic mutations can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
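The idea of scanning a sequence with sequential overlapping probes can be sketched in Python as follows. This is a toy model, not the array chemistry of Cronin et al.: hybridization is approximated by exact substring matching, and the probe length, step size, function names, and example sequences are assumptions made for illustration.

```python
def tile_probes(reference, probe_len=8, step=1):
    """Return (offset, probe) pairs for sequential overlapping probes."""
    last_start = len(reference) - probe_len
    return [(i, reference[i:i + probe_len]) for i in range(0, last_start + 1, step)]


def probes_without_perfect_match(reference, sample, probe_len=8):
    """Offsets of reference probes that have no exact match in the sample."""
    return [
        offset
        for offset, probe in tile_probes(reference, probe_len)
        if probe not in sample
    ]


reference = "ACGTTGCAAGCTTAGGCTAA"
sample = "ACGTTGCAAGTTTAGGCTAA"  # C-to-T substitution at position 10 (0-based)

# Every probe that overlaps the substituted base loses its perfect match,
# which localizes the change to the overlap of the failing probes.
print(probes_without_perfect_match(reference, sample))
```

The run of consecutive failing offsets brackets the changed base, which corresponds to the first, scanning step described above; characterizing the specific substitution would require the second, variant-specific probe set.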

The lipase polynucleotides are also useful for testing an individual for a genotype that, while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the polynucleotides can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). Accordingly, the lipase polynucleotides described herein can be used to assess the mutation content of the gene in an individual in order to select an appropriate compound or dosage regimen for treatment.

Thus, polynucleotides displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

The methods can involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting mRNA, or genomic DNA, such that the presence of mRNA or genomic DNA is detected in the biological sample, and comparing the presence of mRNA or genomic DNA in the control sample with the presence of mRNA or genomic DNA in the test sample.

The lipase polynucleotides are also useful for chromosome identification when the sequence is identified with an individual chromosome and to a particular location on the chromosome. First, the DNA sequence is matched to the chromosome by in situ or other chromosome-specific hybridization. Sequences can also be correlated to specific chromosomes by preparing PCR primers that can be used for PCR screening of somatic cell hybrids containing individual chromosomes from the desired species. Only hybrids containing the chromosome containing the gene homologous to the primer will yield an amplified fragment. Sublocalization can be achieved using chromosomal fragments. Other strategies include prescreening with labeled flow-sorted chromosomes and preselection by hybridization to chromosome-specific libraries. Further mapping strategies include fluorescence in situ hybridization, which allows hybridization with probes shorter than those traditionally used. Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on the chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

The lipase polynucleotides can also be used to identify individuals based on small biological samples. This can be done, for example, using restriction fragment-length polymorphism (RFLP) to identify an individual. Thus, the polynucleotides described herein are useful as DNA markers for RFLP (see U.S. Pat. No. 5,272,057).

Furthermore, the lipase sequence can be used to provide an alternative technique, which determines the actual DNA sequence of selected fragments in the genome of an individual. Thus, the lipase sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify DNA from an individual for subsequent sequencing.
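A minimal Python sketch of deriving such a primer pair from the two ends of a known sequence is given below. It is illustrative only: the primer length, the function names, and the example sequence are assumptions, and considerations such as melting temperature, GC content, and secondary structure that govern real primer design are deliberately omitted.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def end_primers(sequence, primer_len=20):
    """Return (forward, reverse) primers taken from the 5' and 3' ends.

    The forward primer is identical to the 5' end of the given strand; the
    reverse primer is the reverse complement of the 3' end, so that it
    anneals to the given strand and extends back toward the 5' end.
    """
    if len(sequence) < 2 * primer_len:
        raise ValueError("sequence too short for two non-overlapping primers")
    forward = sequence[:primer_len]
    reverse = reverse_complement(sequence[-primer_len:])
    return forward, reverse


# Hypothetical 22-nucleotide example with short 6-nucleotide primers.
print(end_primers("ATGGCCAAGTTTCCGGAACTGA", primer_len=6))
```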

Panels of corresponding DNA sequences from individuals prepared in this manner can provide unique individual identifications, as each individual will have a unique set of such DNA sequences. It is estimated that allelic variation in humans occurs with a frequency of about once per each 500 bases. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. The lipase sequences can be used to obtain such identification sequences from individuals and from tissue. The sequences represent unique fragments of the human genome. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes.

If a panel of reagents from the sequences is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

The lipase polynucleotides can also be used in forensic identification procedures. PCR technology can be used to amplify DNA sequences taken from very small biological samples, such as a single hair follicle or body fluids (e.g., blood, saliva, or semen). The amplified sequence can then be compared to a standard, allowing identification of the origin of the sample.

The lipase polynucleotides can thus be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As described above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to the noncoding region are particularly useful since greater polymorphism occurs in the noncoding regions, making it easier to differentiate individuals using this technique.

The lipase polynucleotides can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This is useful in cases in which a forensic pathologist is presented with a tissue of unknown origin. Panels of lipase probes can be used to identify tissue by species and/or by organ type.

In a similar fashion, these primers and probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Alternatively, the lipase polynucleotides can be used directly to block transcription or translation of lipase gene sequences by means of antisense or ribozyme constructs. Thus, in a disorder characterized by abnormally high or undesirable lipase gene expression, nucleic acids can be directly used for treatment.

The lipase polynucleotides are thus useful as antisense constructs to control lipase gene expression in cells, tissues, and organisms. A DNA antisense polynucleotide is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of lipase protein. An antisense RNA or DNA polynucleotide would hybridize to the mRNA and thus block translation of mRNA into lipase protein.

Examples of antisense molecules useful to inhibit nucleic acid expression include antisense molecules complementary to a fragment of the 5′ untranslated region of SEQ ID NO:18 which also includes the start codon and antisense molecules which are complementary to a fragment of the 3′ untranslated region of SEQ ID NO:18.

Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of lipase nucleic acid. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired lipase nucleic acid expression. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the lipase protein.

The lipase polynucleotides also provide vectors for gene therapy in patients containing cells that are aberrant in lipase gene expression. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired lipase protein to treat the individual.

The invention also encompasses kits for detecting the presence of a lipase nucleic acid in a biological sample. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting lipase nucleic acid in a biological sample; means for determining the amount of lipase nucleic acid in the sample; and means for comparing the amount of lipase nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect lipase mRNA or DNA.

Computer Readable Means

The nucleotide or amino acid sequences of the invention are also provided in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a nucleotide or amino acid sequence of the present invention. Such a manufacture provides the nucleotide or amino acid sequences, or a subset thereof (e.g., a subset of open reading frames (ORFs)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form.

In one application of this embodiment, a nucleotide or amino acid sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. The skilled artisan will readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. The skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide or amino acid sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif.

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
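The relationship between target length and chance occurrence can be made concrete with a short Python calculation. The sketch assumes a simplified model in which every residue in the database is independent and equally likely (4 nucleotides or 20 amino acids); real sequence databases are not uniform, so the figures are order-of-magnitude illustrations only. The function name and the database size of one million residues are hypothetical.

```python
def expected_random_hits(target_len, alphabet_size, database_len):
    """Expected number of chance exact matches under a uniform model.

    Each of the (database_len - target_len + 1) start positions matches a
    given target with probability alphabet_size ** -target_len.
    """
    positions = max(database_len - target_len + 1, 0)
    return positions / alphabet_size ** target_len


DATABASE_LEN = 1_000_000  # hypothetical database size in residues

for length in (6, 30):
    hits = expected_random_hits(length, 4, DATABASE_LEN)
    print(f"{length}-nucleotide target: {hits:.3g} expected chance matches")
```

Under this model a six-nucleotide target is expected to occur hundreds of times by chance in a database of that size, whereas a 30-nucleotide target is expected essentially never, consistent with the preference for longer target sequences stated above.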

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

For example, software which implements the BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al. (1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system can be used to identify open reading frames (ORFs) of the sequences of the invention which contain homology to ORFs or proteins from other libraries. Such ORFs are protein encoding fragments and are useful in producing commercially important proteins such as enzymes used in various reactions and in the production of commercially useful metabolites.
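For illustration, the ORF-enumeration step that would precede such a homology search can be sketched in Python. This is not an implementation of BLAST or BLAZE; it simply scans the three forward reading frames of a DNA string for an ATG start codon followed in frame by a stop codon. The function name, the minimum-length parameter, and the example sequence are assumptions, and the reverse strand is not examined.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return (start, end, orf) tuples for forward-strand ORFs.

    An ORF here runs from the first ATG in a frame to the next in-frame
    stop codon (inclusive); min_codons counts codons before the stop.
    Coordinates are 0-based with an exclusive end.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None
    return orfs


print(find_orfs("CCATGAAATTTGGGTAACC"))
```

Each ORF returned in this way could then be translated and submitted to a similarity search tool of the kind named above.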

Vectors/Host Cells

The invention also provides vectors containing the lipase polynucleotides. The term “vector” refers to a vehicle, preferably a nucleic acid molecule, that can transport the lipase polynucleotides. When the vector is a nucleic acid molecule, the lipase polynucleotides are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the lipase polynucleotides. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the lipase polynucleotides when the host cell replicates.

The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the lipase polynucleotides. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the lipase polynucleotides such that transcription of the polynucleotides is allowed in a host cell. The polynucleotides can be introduced into the host cell with a separate polynucleotide capable of affecting transcription. Thus, the second polynucleotide may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the lipase polynucleotides from the vector. Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself.

It is understood, however, that in some embodiments, transcription and/or translation of the lipase polynucleotides can occur in a cell-free system.

The regulatory sequences to which the polynucleotides described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, trp, and tac promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

A variety of expression vectors can be used to express a lipase polynucleotide. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g. cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The regulatory sequence may provide constitutive expression in one or more host cells (i.e., tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

The lipase polynucleotides can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.
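The cleavage step can be illustrated in silico. The sketch below is a simplification under stated assumptions: it treats the DNA as a linear top strand only, uses EcoRI (recognition site GAATTC, cut after the first G) as the example enzyme, and ignores the complementary strand, overhang chemistry, and ligation. The function name `digest` and the input sequence are hypothetical.

```python
from typing import List


def digest(seq: str, site: str = "GAATTC", cut_offset: int = 1) -> List[str]:
    """Cut a linear top-strand sequence at every occurrence of site.

    cut_offset is the number of bases of the recognition site left on the
    upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])  # remainder downstream of the last cut
    return fragments


# Hypothetical insert flanked by two EcoRI sites.
fragments = digest("AAGAATTCGGGAATTCTT")
```

Fragments that begin with AATTC carry ends produced by the same enzyme, which is the property that permits an insert and a vector cut with that enzyme to be ligated together.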

The vector containing the appropriate polynucleotide can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

As described herein, it may be desirable to express the polypeptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the lipase polypeptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting, for example, as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired polypeptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al. (1990) Gene Expression Technology: Methods in Enzymology 185:60-89).

Recombinant protein expression can be maximized in a host bacterium by providing a genetic background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S. (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Alternatively, the sequence of the polynucleotide of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118).
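Altering a sequence for preferential codon usage can be sketched as a synonymous recoding: each codon is replaced by the host-preferred codon for the same amino acid, leaving the encoded polypeptide unchanged. In the minimal example below, the standard genetic code is built from the conventional TCAG ordering, while the small `PREFERRED` table is an illustrative assumption for E. coli and is not taken from Wada et al.; actual preferences should be read from a codon usage table for the chosen host. The names `recode` and `translate` are hypothetical.

```python
from typing import Dict

_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE: Dict[str, str] = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}

# Illustrative preferences only (assumed values for the example).
PREFERRED: Dict[str, str] = {"L": "CTG", "K": "AAA", "*": "TAA"}


def translate(cds: str) -> str:
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds: str, preferred: Dict[str, str] = PREFERRED) -> str:
    """Swap each codon for the preferred synonymous codon, if one is listed."""
    cds = cds.upper()
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        out.append(preferred.get(CODON_TABLE[codon], codon))
    return "".join(out)


# Hypothetical four-codon sequence: Met-Leu-Lys-stop.
optimized = recode("ATGTTAAAGTAG")
```

Because only synonymous substitutions are made, `translate` returns the same amino acid string before and after recoding.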

The lipase polynucleotides can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include pYepSec1 (Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan et al. (1982) Cell 30:933-943), pJRY88 (Schultz et al. (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

The lipase polynucleotides can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell. Biol. 3:2156-2165) and the pVL series (Luckow et al. (1989) Virology 170:31-39).

In certain embodiments of the invention, the polynucleotides described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195).

The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the lipase polynucleotides. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation, or expression of the polynucleotides described herein. These are found, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the polynucleotide sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).

The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, lipofection, and other techniques such as those found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

Host cells can contain more than one vector. Thus, different nucleotide sequences can be introduced on different vectors of the same cell. Similarly, the lipase polynucleotides can be introduced either alone or with other polynucleotides that are not related to the lipase polynucleotides such as those providing trans-acting factors for expression vectors. When more than one vector is introduced into a cell, the vectors can be introduced independently, co-introduced or joined to the lipase polynucleotide vector.

In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be replication-competent or replication-defective. In the case in which viral replication is defective, replication will occur in host cells providing functions that complement the defects.

Vectors generally include selectable markers that enable the selection of the subpopulation of cells that contain the recombinant vector constructs. The marker can be contained in the same vector that contains the polynucleotides described herein or may be on a separate vector. Markers include tetracycline or ampicillin-resistance genes for prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that provides selection for a phenotypic trait will be effective.

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other cells under the control of the appropriate regulatory sequences, cell-free transcription and translation systems can also be used to produce these proteins using RNA derived from the DNA constructs described herein.

Where secretion of the polypeptide is desired, appropriate secretion signals are incorporated into the vector. The signal sequence can be endogenous to the lipase polypeptides or heterologous to these polypeptides.

Where the polypeptide is not secreted into the medium, the protein can be isolated from the host cell by standard disruption procedures, including freeze thaw, sonication, mechanical disruption, use of lysing agents and the like. The polypeptide can then be recovered and purified by well-known purification methods including ammonium sulfate precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, or high performance liquid chromatography.

It is also understood that depending upon the host cell in recombinant production of the polypeptides described herein, the polypeptides can have various glycosylation patterns, depending upon the cell, or may be non-glycosylated as when produced in bacteria. In addition, the polypeptides may include an initial modified methionine in some cases as a result of a host-mediated process.

Uses of Vectors and Host Cells

It is understood that “host cells” and “recombinant host cells” refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

The host cells expressing the polypeptides described herein, and particularly recombinant host cells, have a variety of uses. First, the cells are useful for producing lipase proteins or polypeptides that can be further purified to produce desired amounts of lipase protein or fragments. Thus, host cells containing expression vectors are useful for polypeptide production.

Host cells are also useful for conducting cell-based assays involving the lipase or lipase fragments. Thus, a recombinant host cell expressing a native lipase is useful to assay for compounds that stimulate or inhibit lipase function. This includes disappearance of substrate (triglycerides, phospholipids, lipoproteins), appearance of end product (fatty acids), and the various other molecular functions described herein that include, but are not limited to, substrate recognition, substrate binding, subunit association, and interaction with other cellular components. Modulation of gene expression can occur at the level of transcription or translation.

Host cells are also useful for identifying lipase mutants in which these functions are affected. If the mutants naturally occur and give rise to a pathology, host cells containing the mutations are useful to assay compounds that have a desired effect on the mutant lipase (for example, stimulating or inhibiting function) which may not be indicated by their effect on the native lipase.

Recombinant host cells are also useful for expressing the chimeric polypeptides described herein to assess compounds that activate or suppress activation or alter specific function by means of a heterologous domain, segment, site, and the like, as disclosed herein.

Further, mutant lipases can be designed in which one or more of the various functions is engineered to be increased or decreased, for example, substrate binding activity or the catalytic activity of the lipase, and used to augment or replace lipase proteins in an individual. Thus, host cells can provide a therapeutic benefit by replacing an aberrant lipase or providing an aberrant lipase that provides a therapeutic result. In one embodiment, the cells provide lipases that are abnormally active.

In another embodiment, the cells provide lipases that are abnormally inactive. These lipases can compete with endogenous lipase polypeptides in the individual.

In another embodiment, cells expressing lipases that cannot be activated are introduced into an individual in order to compete with endogenous lipases for their various substrates. For example, in the case in which excessive lipase or analog is part of a treatment modality, it may be necessary to inactivate this molecule at a specific point in treatment. Providing cells that compete for the molecule, but which cannot be affected by lipase activation would be beneficial.

Homologously recombinant host cells can also be produced that allow the in situ alteration of endogenous lipase polynucleotide sequences in a host cell genome. The host cell includes, but is not limited to, a stable cell line, cell in vivo, or cloned microorganism. This technology is more fully described in WO 93/09222, WO 91/12650, WO 91/06667, U.S. Pat. No. 5,272,071, and U.S. Pat. No. 5,641,670. Briefly, specific polynucleotide sequences corresponding to the lipase polynucleotides or sequences proximal or distal to a lipase gene are allowed to integrate into a host cell genome by homologous recombination where expression of the gene can be affected. In one embodiment, regulatory sequences are introduced that either increase or decrease expression of an endogenous sequence. Accordingly, a lipase can be produced in a cell not normally producing it. Alternatively, increased expression of lipase can be effected in a cell normally producing the protein at a specific level. Further, expression can be decreased or eliminated by introducing a specific regulatory sequence. The regulatory sequence can be heterologous to the lipase protein sequence or can be a homologous sequence with a desired mutation that affects expression. Alternatively, the entire gene can be deleted. The regulatory sequence can be specific to the host cell or capable of functioning in more than one cell type. Still further, specific mutations can be introduced into any desired region of the gene to produce mutant lipase proteins. Such mutations could be introduced, for example, into specific functional regions such as the triglyceride or phospholipid-binding site.

In one embodiment, the host cell can be a fertilized oocyte or embryonic stem cell that can be used to produce a transgenic animal containing the altered lipase gene. Alternatively, the host cell can be a stem cell or other early tissue precursor that gives rise to a specific subset of cells and can be used to produce transgenic tissues in an animal. See also Thomas et al., Cell 51:503 (1987) for a description of homologous recombination vectors. The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced gene has homologously recombined with the endogenous lipase gene are selected (see e.g., Li, E. et al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT International Publication Nos. WO 90/11354; WO 91/01140; and WO 93/04169.

The genetically engineered host cells can be used to produce non-human transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse, in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal in one or more cell types or tissues of the transgenic animal. These animals are useful for studying the function of a lipase protein and identifying and evaluating modulators of lipase protein activity.

Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians.

In one embodiment, a host cell is a fertilized oocyte or an embryonic stem cell into which lipase polynucleotide sequences have been introduced.

A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. Any of the lipase nucleotide sequences can be introduced as a transgene into the genome of a non-human animal, such as a mouse.

Any of the regulatory or other sequences useful in expression vectors can form part of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already included. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct expression of the lipase protein to particular cells.

Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to other transgenic animals carrying other transgenes. A transgenic animal also includes animals in which the entire animal or tissues in the animal have been produced using the homologously recombinant host cells described herein.

In another embodiment, transgenic non-human animals can be produced which contain selected systems, which allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS 89:6232-6236. Another example of a recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature 385:810-813 and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle and enter G₀ phase. The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst and then transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

Transgenic animals containing recombinant cells that express the polypeptides described herein are useful to conduct the assays described herein in an in vivo context. The various physiological factors that are present in vivo and that could affect, for example, binding, activation, and protein turnover, may not be evident from in vitro cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to assay in vivo lipase function, including substrate interaction, the effect of specific mutants on lipase function and substrate interaction, and the effect of chimeric lipases. It is also possible to assess the effect of null mutations, that is, mutations that substantially or completely eliminate one or more lipase functions.

In general, methods for producing transgenic animals include introducing a nucleic acid sequence according to the present invention, the nucleic acid sequence capable of expressing the lipase in a transgenic animal, into a cell in culture or in vivo. When introduced in vivo, the nucleic acid is introduced into an intact organism such that one or more cell types and, accordingly, one or more tissue types, express the nucleic acid encoding the lipase. Alternatively, the nucleic acid can be introduced into virtually all cells in an organism by transfecting a cell in culture, such as an embryonic stem cell, as described herein for the production of transgenic animals, and this cell can be used to produce an entire transgenic organism. As described, in a further embodiment, the host cell can be a fertilized oocyte. Such cells are then allowed to develop in a female foster animal to produce the transgenic organism.

Pharmaceutical Compositions

The lipase nucleic acid molecules, protein, modulators of the protein, and antibodies (also referred to herein as “active compounds”) can be incorporated into pharmaceutical compositions suitable for administration to a subject, e.g., a human. Such compositions typically comprise the nucleic acid molecule, protein, modulator, or antibody and a pharmaceutically acceptable carrier.

The term “administer” is used in its broadest sense and includes any method of introducing the compositions of the present invention into a subject. This includes producing polypeptides or polynucleotides in vivo as by transcription or translation, in vivo, of polynucleotides that have been exogenously introduced into a subject. Thus, polypeptides or nucleic acids produced in the subject from the exogenous compositions are encompassed in the term “administer.”

As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, such media can be used in the compositions of the invention. Supplementary active compounds can also be incorporated into the compositions. A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampules, disposable syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., a lipase protein or anti-lipase antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For oral administration, the agent can be contained in enteric forms to survive the stomach or further coated or mixed to be released in a particular region of the GI tract by known methods. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser, which contains a suitable propellant, e.g., a gas such as carbon dioxide, or a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. “Dosage unit form” as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated, each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specification for the dosage unit forms of the invention is dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (U.S. Pat. No. 5,328,470) or by stereotactic injection (see, e.g., Chen et al. (1994) PNAS 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g., retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight.
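As a worked illustration of the weight-based ranges above, the following Python sketch converts a per-kilogram range into a total amount for a given body weight. The tier labels, the function name `dose_range_mg`, and the 70 kg example weight are assumptions introduced here for illustration; only the numeric tiers are taken from the preceding paragraph.

```python
# Illustrative sketch only. Tier labels and the function name are hypothetical;
# the mg/kg bounds are copied from the paragraph above.
DOSE_TIERS_MG_PER_KG = [
    ("broad", 0.001, 30.0),
    ("preferred", 0.01, 25.0),
    ("more preferred", 0.1, 20.0),
    ("even more preferred", 1.0, 10.0),
]


def dose_range_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return the (low, high) total amount in mg for one body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound must not exceed high bound")
    return (body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg)


if __name__ == "__main__":
    # Assumed example subject weight of 70 kg.
    for label, low, high in DOSE_TIERS_MG_PER_KG:
        low_mg, high_mg = dose_range_mg(70.0, low, high)
        print(f"{label}: about {low_mg:g} to {high_mg:g} mg")
```

The calculation is a plain multiplication of body weight by each bound; the factors discussed in the following paragraph (severity, prior treatment, general health) are not modeled.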

The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments. In a preferred example, a subject is treated with antibody, protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays as described herein.

The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

It is understood that appropriate doses of small molecule agents depend upon a number of factors within the purview of the ordinarily skilled physician, veterinarian, or researcher. The dose(s) of the small molecule will vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be determined using the assays described herein. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

This invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will fully convey the invention to those skilled in the art. Many modifications and other embodiments of the invention will come to mind to one skilled in the art to which this invention pertains having the benefit of the teachings presented in the foregoing description. Although specific terms are employed, they are used as in the art unless otherwise indicated.

CHAPTER 5 25678, a Novel Human Adenylate Cyclase

BACKGROUND OF THE INVENTION

Adenylate cyclase is a membrane-bound enzyme that acts as an effector protein in a receptor-effector system referred to as the cAMP signal transduction pathway. As such, it plays a key intermediate role in the conversion of extracellular signals, perceived by various receptors following binding of a particular ligand, into intracellular signals that, in turn, generate specific cellular responses.

A variety of hormones, neurotransmitters, and olfactants regulate the synthesis of cAMP by adenylate cyclases. In most tissues, regulation of cAMP synthesis is accomplished through three plasma membrane-associated components: G-protein-coupled receptors (GPCRs), which interact with regulatory hormones and neurotransmitters; heterotrimeric G proteins that either stimulate or inhibit the catalytic subunit of adenylate cyclase in response to interaction of ligands with appropriate GPCRs; and the catalytic entity, adenylate cyclase. Each G protein contains a guanine nucleotide-binding α-subunit and a complex of tightly associated β- and γ-subunits. When a G protein is activated following binding of a ligand to a GPCR, GDP is released from the α-subunit in exchange for GTP. Binding of the GTP results in conformational changes that yield dissociation of the GTP-bound α-subunit from the β-γ-subunit complex. The resulting macromolecular complexes regulate catalytic activity of adenylate cyclase. Where the receptor is a stimulatory receptor (R_(s)), interaction with a stimulatory G-protein, termed G_(s), results in activation of the adenylate cyclase catalytic subunit by the GTP-bound form of the G_(s) α-subunit. In contrast, where the receptor is an inhibitory receptor (R_(i)), interaction with an inhibitory G-protein (one of several known G_(i)s) results in inhibition of the adenylate cyclase catalytic subunit by the GTP-bound form of the G_(i) α-subunit. In addition, the G-protein β-γ-subunit complex may interact with and influence adenylate cyclase activity independent of or in parallel with the GTP-bound α-subunit, depending upon the adenylate cyclase isoform involved. See Taussig and Gilman (1995) J. Biol. Chem. 270:1-4; Hardman et al., eds. (1996) Goodman and Gilman's Pharmacological Basis of Therapeutics (McGraw-Hill Company, New York, N.Y.).

When activated, the catalytic subunit of adenylate cyclase converts intracellular ATP into cAMP. This second messenger then activates protein kinases, particularly protein kinase A. Activation of this protein kinase causes the phosphorylation of downstream target proteins involved in a number of metabolic pathways, thus initiating a signal transduction cascade.

The extent to which adenylate cyclase converts ATP to cAMP is highly dependent on the state of phosphorylation of the various components of the hormone-sensitive adenylate cyclase system. For example, stimulatory and inhibitory receptors are desensitized and down-regulated following phosphorylation by various kinases, particularly cAMP-dependent protein kinases, protein kinase C, and other receptor-specific kinases that preferentially use agonist-bound forms of receptors as substrates. In this manner, tight regulation of the cellular cAMP concentration, and hence regulation of the cAMP signal transduction pathway, is achieved (Taussig and Gilman (1995) J. Biol. Chem. 270:1-4).

Adenylate cyclase activation may also occur through increased intracellular calcium concentration, especially in nervous system and cardiovascular tissues. After depolarization, the influx of calcium elicits the activation of calmodulin, an intracellular calcium-binding protein. In the cardiovascular system, this effect gives rise to the contraction of the blood vessels or cardiac myocytes. The activated calmodulin has been shown to bind and activate some isoforms of adenylate cyclase.

Several novel isoforms of mammalian adenylate cyclase have been identified through molecular cloning. Type I adenylate cyclase (CYA1) is primarily localized in brain tissues (see Krupinski et al. (1989) Science 244:1558-1564; Gilman (1987) Ann. Rev. Biochem. 56:615-649, citing Salter et al. (1981) J. Biol. Chem. 256:9830-9833; Andreasen et al. (1983) Biochemistry 22:2757-2762; and Smigel et al. (1986) J. Biol. Chem. 261:1976-1982 for bovine CYA1; and Villacres et al. (1993) Genomics 16:473-478 for human CYA1). The type II adenylate cyclase (CYA2) is localized in brain and lung tissues (see Feinstein et al. (1991) Proc. Natl. Acad. Sci. USA 88:10173-10177 for rat CYA2; and Stengel et al. (1992) Hum. Genet. 90:126-130 for human CYA2). Type III adenylate cyclase (CYA3) is primarily localized in olfactory neuroepithelium and is thought to mediate olfactory receptor responses (Bakalyar and Reed (1990) Science 250:1403-1406; Glatt and Snyder (1993) Nature 361:536-538; and Xia (1992) Neurosci. Lett. 144:169-173). Type IV adenylate cyclase (CYA4) most resembles type II, but is expressed in a variety of peripheral tissues and in the central nervous system (Gao and Gilman (1991) Proc. Natl. Acad. Sci. USA 88:10178-10182, for rat CYA4). Type V adenylate cyclase (CYA5) (Ishikawa et al. (1992) J. Biol. Chem. 267:13553-13557; Premont et al. (1992) Proc. Natl. Acad. Sci. USA 89:9809-9813; Glatt and Snyder (1993) Nature 361:536-538; and Krupinski et al. (1992) J. Biol. Chem. 267:24858-24862) and type VI adenylate cyclase (CYA6) (Premont et al. (1992) Proc. Natl. Acad. Sci. USA 89:9809-9813; Yoshimura and Cooper (1992) Proc. Natl. Acad. Sci. USA 89:6716-6720; Katsushika et al. (1992) Proc. Natl. Acad. Sci. USA 89:8774-8778; and Krupinski et al. (1992) J. Biol. Chem. 267:24858-24862) both exhibit a widely distributed expression pattern, with type V having high expression in heart and striatum, and type VI having high expression in heart and brain. Type VII adenylate cyclase (CYA7) is widely distributed, though it may be absent from brain tissues (Krupinski et al. (1992) J. Biol. Chem. 267:24858-24862). Type VIII adenylate cyclase (CYA8) is abundant in brain tissues (Krupinski et al. (1992) J. Biol. Chem. 267:24858-24862; and Parma et al. (1991) Biochem. Biophys. Res. Commun. 179:455-462). Type IX adenylate cyclase (CYA9) is widely expressed, at high levels in skeletal muscle and brain (Premont et al. (1996) J. Biol. Chem. 271:13900-13907).

The different isoforms of adenylate cyclase exhibit unique patterns of regulatory responses (see Sunahara et al. (1996) Annu. Rev. Pharmacol. Toxicol. 36:461-480). For example, all of these isoforms are activated by the α-subunit of a particular G protein, termed G_(s), which couples the stimulatory action of the ligand-bound receptor to activation of adenylate cyclase. The adenylate cyclases designated type I, III, and VIII are also stimulated by Ca²⁺/calmodulin in vitro, while type II, IV, V, VI, VII, and IX are not. Type I is inhibited by G protein β-γ-subunit complex, independently of G_(s) activation, while Type II is highly stimulated by G protein β-γ-subunit complex when simultaneously activated by the G_(s) α-subunit. Type III, in contrast, is not affected by G protein β-γ-subunit complex. Type V and type VI are both inhibited by low levels of Ca²⁺, but appear to be unaffected by G protein β-γ-subunit complex. Type IX is unique in that it is stimulated by Mg²⁺, but is not affected by G protein β-γ-subunit complex.

The genes for these adenylate cyclases all encode proteins having molecular weights of approximately 120,000 and ranging from 1064 to 1353 amino acid residues. These proteins are predicted to have a short cytoplasmic amino terminus followed by a first motif consisting of six transmembrane spans and a cytoplasmic domain (domain C₁), and then a second motif, also consisting of six transmembrane spans and a second cytoplasmic domain (domain C₂). The two cytoplasmic domains are approximately 40 kDa each and contain a region of homology (designated C_(1a) and C_(2a)) with each other and with the catalytic domains of membrane-bound guanylate cyclases. Based on this similarity, these domains are considered to be nucleotide binding domains, and together have been shown to be sufficient to confer enzymatic activity (Tang and Gilman (1995) Science 268:1769-1772).

Alterations in the cAMP signal transduction pathway have been associated with diseases such as asthma, cancer, inflammation, hypertension, atherosclerosis, and heart failure. Antihypertensive drug therapy involves modulation of adenylate cyclase levels (Marcil et al. (1996) Hypertension 28:83-90). In addition, studies of heart in human and animal models indicate that adenylate cyclase has a function in cardiomyopathy (Michael et al. (1995) Hypertension 25:962-970; Roth et al. (1999) Circulation 99:3099-3102), ischemia (Sandhu et al. (1996) Circulation Research 78:137-147), myocardial infarction (Espinasse et al. (1999) Cardiovascular Research 42:87-98) and congestive heart failure (Kawahira et al. (1998) Circulation 98:262-267; Panza et al. (1995) Circulation 91:1732-1738). The enzyme is also related to some mental disorders. Studies of learning and memory in animal models indicate a likely role for calmodulin-activated adenylate cyclases in conditioning (Abrams and Kandel (1988) Trends Neurosci. 11:128-135), learning (Livingstone et al. (1984) Cell 37:205-215), and long-term potentiation (Frey et al. (1993) Science 260:1661-1664). Furthermore, the cAMP signaling pathway plays an important role in cardiovascular physiology. For instance, cAMP activates protein kinase A (PKA). The activated subunits of PKA initiate a series of enzymatic reactions that ultimately activate multiple proteins that regulate both the rate and force of cardiac contraction.

Accordingly, adenylate cyclases are a major target for drug action and development. It is therefore valuable to the field of pharmaceutical development to identify and characterize novel adenylate cyclases and tissues and disorders in which adenylate cyclases are differentially expressed. The present invention advances the state of the art by providing a novel human adenylate cyclase and tissues and disorders in which expression of a human adenylate cyclase is relevant. The invention thus provides methods directed to expression of the adenylate cyclase.

SUMMARY OF THE INVENTION

It is an object of the invention to identify novel adenylate cyclases and tissues and disorders in which expression of the adenylate cyclase is relevant.

It is a further object of the invention to provide novel adenylate cyclase polypeptides that are useful as reagents or targets in adenylate cyclase assays applicable to treatment and diagnosis of disorders mediated by or related to the adenylate cyclase.

It is a further object of the invention to provide polynucleotides corresponding to the adenylate cyclase polypeptides that are useful as targets or reagents in adenylate cyclase assays applicable to treatment and diagnosis of disorders mediated by or related to the adenylate cyclase and useful for producing novel adenylate cyclase polypeptides by recombinant methods.

A specific object of the invention is to identify compounds that act as agonists and antagonists and modulate the expression of the adenylate cyclase in specific tissues and disorders.

A further specific object of the invention is to provide compounds that modulate expression of the adenylate cyclase for treatment and diagnosis of adenylate cyclase-mediated or related disorders.

The invention is thus based on the identification and expression of a human adenylate cyclase, especially in specific tissues and disorders.

The invention provides methods of screening for compounds that modulate expression or activity of the adenylate cyclase polypeptides or nucleic acid (RNA or DNA) in the specific tissues or disorders.

The invention also provides a process for modulating adenylate cyclase polypeptide or nucleic acid expression or activity, especially using the screened compounds.

Modulation may be used to treat conditions related to aberrant activity or expression of the adenylate cyclase polypeptides or nucleic acids.

The invention also provides assays for determining the activity of or the presence or absence of the adenylate cyclase polypeptides or nucleic acid molecules in specific biological samples, including for disease diagnosis.

The invention also provides assays for determining the presence of a mutation in the polypeptides or nucleic acid molecules, including for disease diagnosis.

The invention provides isolated adenylate cyclase polypeptides, including a polypeptide having the amino acid sequence shown in SEQ ID NO:19 or the amino acid sequence encoded by the cDNA insert of the plasmid deposited as ATCC Patent Deposit PTA-1871 on May 12, 2000 (“the deposited cDNA”).

The invention also provides an isolated adenylate cyclase nucleic acid molecule having the sequence shown in SEQ ID NO:20 or encoded by the deposited cDNA.

The invention also provides variant polypeptides having an amino acid sequence that is substantially homologous to the amino acid sequence shown in SEQ ID NO:19 or encoded by the deposited cDNA.

The invention also provides variant nucleic acid sequences that are substantially homologous to the nucleotide sequence shown in SEQ ID NO:20 or in the deposited cDNA.

The invention also provides fragments of the polypeptide shown in SEQ ID NO:19 and nucleotide sequence shown in SEQ ID NO:20, as well as substantially homologous fragments of the polypeptide or nucleic acid.

The invention further provides nucleic acid constructs comprising the nucleic acid molecules described herein. In a preferred embodiment, the nucleic acid molecules of the invention are operatively linked to a regulatory sequence.

The invention also provides vectors and host cells that express the adenylate cyclase and provides methods for expressing the adenylate cyclase nucleic acid molecules and polypeptides in specific cell types and disorders, and particularly recombinant vectors and host cells.

The invention also provides methods of making the vectors and host cells and provides methods for using them to produce adenylate cyclase nucleic acid molecules and polypeptides and to assay expression and cellular effects of expression of the adenylate cyclase nucleic acid molecules and polypeptides in specific cell types and disorders.

The invention also provides antibodies or antigen-binding fragments thereof that selectively bind the adenylate cyclase polypeptides and fragments.

In still a further embodiment, the invention provides a computer readable means containing the nucleotide and/or amino acid sequences of the nucleic acids and polypeptides of the invention.

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the invention are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.

Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

The present invention is based, at least in part, on the identification of novel molecules, referred to herein as adenylate cyclase nucleic acid and polypeptide molecules, which play a key role in regulation of the cyclic AMP (cAMP) signal transduction pathway by virtue of their conversion of intracellular ATP into cAMP. In one embodiment, the adenylate cyclase molecules modulate the activity of one or more proteins involved in cellular metabolism associated with cell maintenance, growth, or differentiation, e.g., cardiac, epithelial, or neuronal cell maintenance, growth, or differentiation. In another embodiment, the adenylate cyclase molecules of the present invention are capable of modulating the phosphorylation state of one or more proteins involved in cellular metabolism associated with cell maintenance, growth, or differentiation, e.g., cardiac, epithelial, or neuronal cell maintenance, growth, or differentiation, via their indirect effect on cAMP-dependent protein kinases, particularly protein kinase A, as described in, for example, Devlin (1997) Textbook of Biochemistry with Clinical Correlations (Wiley-Liss, Inc., New York, N.Y.). In addition, the receptors which trigger activity of the adenylate cyclases of the present invention are targets of drugs as described in Goodman and Gilman (1996), The Pharmacological Basis of Therapeutics (9^(th) ed.) Hardman & Limbird Editors, the contents of which are incorporated herein by reference.

As used herein, a “signaling pathway” refers to the modulation (e.g., stimulation or inhibition) of a cellular function/activity upon the binding of a ligand to a receptor. Examples of such functions include mobilization of intracellular molecules that participate in a signal transduction pathway, e.g., phosphatidylinositol 4,5-bisphosphate (PIP₂), inositol 1,4,5-trisphosphate (IP₃) and adenylate cyclase; polarization of the plasma membrane; production or secretion of molecules; alteration in the structure of a cellular component; cell proliferation, e.g., synthesis of DNA; cell migration; cell differentiation; and cell survival.

The response depends on the type of cell. In some cells, binding of a ligand to the receptor may stimulate an activity such as release of compounds, gating of a channel, cellular adhesion, migration, differentiation, etc., through phosphatidylinositol or cyclic AMP metabolism and turnover, while in other cells, binding will produce a different result.

The cAMP turnover pathway is a signaling pathway. As used herein, “cyclic AMP turnover and metabolism” refers to the molecules involved in the turnover and metabolism of cAMP as well as to the activities of these molecules. Cyclic AMP is a second messenger produced in response to ligand-induced stimulation of certain receptors. In the cAMP signaling pathway, binding of a ligand can lead to the activation of the enzyme adenyl cyclase, which catalyzes the synthesis of cAMP. The newly synthesized cAMP can in turn activate a cAMP-dependent protein kinase. This activated kinase can phosphorylate a voltage-gated potassium channel protein, or an associated protein, and lead to the inability of the potassium channel to open during an action potential. The inability of the potassium channel to open results in a decrease in the outward flow of potassium, which normally repolarizes the membrane of a neuron, leading to prolonged membrane depolarization.

The cGMP turnover pathway is also a signaling pathway. As used herein, “cyclic GMP turnover and metabolism” refers to the molecules involved in the turnover and metabolism of cGMP as well as to the activities of these molecules. Cyclic GMP is a second messenger produced in response to ligand-induced stimulation of certain receptors. In the cGMP signaling pathway, binding of a ligand can lead to the activation of the enzyme guanyl cyclase, which catalyzes the synthesis of cGMP. Synthesized cGMP can in turn activate a cGMP-dependent protein kinase.

The invention is directed to methods, uses and reagents applicable to methods and uses that are applied to cells, tissues and disorders of these cells and tissues wherein adenylate cyclase expression is relevant. The adenylate cyclase is expressed in a variety of tissues as shown in FIGS. 50 and 51. Accordingly, the methods and uses of the invention as disclosed in greater detail below apply to these tissues, disorders involving these tissues, and particularly to the disorders with which gene expression is associated, as shown in these figures and as disclosed herein. Accordingly, the methods, uses and reagents disclosed in greater detail below especially apply to prostate, skeletal muscle, brain, and testis. In addition, low positive expression is also observed in aorta with lower relative expression in the aorta with intimal proliferations, and internal mammary artery. In addition, using heart as a reference, low positive expression is seen in ischemic and myopathic hearts. Accordingly, the uses, reagents and methods disclosed in detail herein below apply especially to these tissues, cell types, and disorders.

Methods Using the Polypeptides

The protein sequences of the present invention can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.
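To illustrate what the wordlength parameter controls, the following Python sketch finds positions where a query and a subject sequence share an identical word of a given length. It is a deliberate simplification and not the BLAST implementation: BLAST protein searches also seed on similar ("neighborhood") words scored with a substitution matrix and then extend and score each hit, none of which is modeled here. The function names and the toy sequences are hypothetical.

```python
from collections import defaultdict


def index_words(sequence, word_length):
    """Map each word of length word_length to its start positions in sequence."""
    index = defaultdict(list)
    for start in range(len(sequence) - word_length + 1):
        index[sequence[start:start + word_length]].append(start)
    return index


def find_seed_hits(query, subject, word_length=3):
    """Return sorted (query_pos, subject_pos) pairs sharing an identical word.

    Exact-match seeding only; extension and scoring are intentionally omitted.
    """
    subject_index = index_words(subject, word_length)
    hits = []
    for q_pos in range(len(query) - word_length + 1):
        word = query[q_pos:q_pos + word_length]
        for s_pos in subject_index.get(word, []):
            hits.append((q_pos, s_pos))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequences, invented for illustration (not from any SEQ ID NO).
    print(find_seed_hits("MKVLA", "AAKVLG", word_length=3))
```

A shorter word length yields more seed positions and therefore more candidate alignments to examine, which is why protein searches (wordlength=3) and nucleotide searches (wordlength=12) use different values.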

The adenylate cyclase polypeptides are useful for producing antibodies specific for the adenylate cyclase, regions, or fragments. Regions having a high antigenicity index score are shown in FIG. 47.

The invention provides methods using the adenylate cyclase, variants, or fragments, including but not limited to use in the cells, tissues, and disorders as disclosed herein.

The invention provides biological assays related to adenylate cyclases. Such assays involve any of the known functions or activities or properties useful for diagnosis and treatment of adenylate cyclase-related conditions. These include, but are not limited to, binding and/or activation by G-protein subunits (alpha, beta, or gamma), hydrolysis of ATP or GTP and consequent modulation of cAMP and/or cGMP intracellular concentration, ability to be bound by specific antibody, GTP or ATP binding, and protein kinase A phosphorylation, as well as the various other properties and functions disclosed herein and disclosed in the references cited herein.

The invention provides drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the adenylate cyclase, as a biopsy, or expanded in cell culture. In one embodiment, cell-based assays involve recombinant host cells expressing the adenylate cyclase. Accordingly, cells that are useful in this regard include, but are not limited to, those disclosed herein as expressing or differentially expressing the adenylate cyclase, such as those shown in FIGS. 50 and 51. These include, but are not limited to, cells or tissues derived from prostate, skeletal muscle, brain, colon, ovary, aorta, testis, placenta, fetal heart, aorta with intimal proliferations, internal mammary artery, kidney, and saphenous vein. Such cells can naturally express the gene or can be recombinant, containing one or more copies of exogenously-introduced adenylate cyclase sequences or genetically modified to modulate expression of the endogenous adenylate cyclase sequence.

This aspect of the invention particularly relates to cells derived from subjects with disorders involving the tissues in which the adenylate cyclase is expressed or derived from tissues subject to disorders including, but not limited to, those disclosed herein. These disorders may naturally occur, as in populations of human subjects, or may occur in model systems such as in vitro systems or in vivo, such as in non-human transgenic organisms, particularly in non-human transgenic animals.

Such assays can involve the identification of agents that interact with the adenylate cyclase protein. This interaction can be detected by functional assays, such as the ability to be affected by an effector molecule, such as binding and/or activation by G-protein subunits or hydrolysis of ATP and/or GTP to modulate intracellular cAMP/cGMP concentrations. Such interaction can also be measured by ultimate biological effects, such as phosphorylation of protein kinases, for example protein kinase A, and other downstream effectors in the signal transduction pathway, having biological effects on immunity/inflammation or cell proliferation, i.e., any of the effects of modulating the intracellular levels of the second messengers cAMP and cGMP.

Determining the ability of the test compound to interact with the adenylate cyclase can also comprise determining the ability of the test compound to preferentially bind to the polypeptide as compared to the ability of a known binding molecule (e.g., G-protein, calmodulin, GTP or ATP) to bind to the polypeptide.
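One conventional way to reason about such a comparison is the textbook single-site competitive binding model, sketched below in Python. This model is an assumption introduced for illustration and is not a method described in this specification: it supposes reversible binding at equilibrium, one binding site, and free concentrations approximately equal to total concentrations. The function name, parameter names, and example values are hypothetical.

```python
def fractional_occupancy(ligand_conc, ligand_kd, competitor_conc=0.0, competitor_ki=1.0):
    """Fraction of sites occupied by the known ligand with a competitor present.

    Single-site competitive model: occupancy = L / (L + Kd * (1 + I / Ki)).
    All concentrations and constants must share the same units.
    """
    if ligand_conc < 0 or competitor_conc < 0:
        raise ValueError("concentrations must be non-negative")
    if ligand_kd <= 0 or competitor_ki <= 0:
        raise ValueError("dissociation constants must be positive")
    # The competitor raises the apparent Kd of the known ligand.
    apparent_kd = ligand_kd * (1.0 + competitor_conc / competitor_ki)
    return ligand_conc / (ligand_conc + apparent_kd)


if __name__ == "__main__":
    # Hypothetical values: known ligand at its Kd, test compound at its Ki.
    print(fractional_occupancy(1.0, 1.0))
    print(fractional_occupancy(1.0, 1.0, competitor_conc=1.0, competitor_ki=1.0))
```

Under these assumptions, a test compound that binds preferentially (low Ki relative to the known ligand's Kd) lowers the known ligand's occupancy more strongly at a given concentration.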

In yet another aspect of the invention, the invention provides methods to identify proteins that interact with the adenylate cyclase in the tissues and disorders disclosed. The proteins of the invention can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO 94/10300), to identify other proteins (captured proteins) which bind to or interact with the proteins of the invention and modulate their activity.

The invention provides methods to identify compounds that modulate adenylate cyclase activity. Such compounds, for example, can increase or decrease affinity or rate of binding to GTP or ATP, compete with GTP or ATP for binding to the adenylate cyclase, or displace GTP or ATP bound to the adenylate cyclase. Such compounds can also increase or decrease affinity or rate of binding to calmodulin, compete with calmodulin for binding to the adenyl cyclase, or displace calmodulin bound to the adenyl cyclase. Such compounds can also, for example, increase or decrease the affinity or rate of binding of one or more G-protein subunits, compete with the subunits for binding, or displace the subunits bound to the adenyl cyclase. Both adenylate cyclase and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to the adenylate cyclase. These compounds can be further screened against a functional adenylate cyclase to determine the effect of the compound on the adenylate cyclase activity. Compounds can be identified that activate (agonist) or inactivate (antagonist) the adenylate cyclase to a desired degree. Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject). The subject can be a human subject, for example, a subject in a clinical trial or undergoing treatment or diagnosis, or a non-human transgenic subject, such as a transgenic animal model for disease.

The invention provides methods to screen a compound for the ability to stimulate or inhibit interaction between the adenylate cyclase protein and a target molecule that normally interacts with the adenylate cyclase protein. The target can be ATP or GTP, or another component of the signal pathway with which the adenylate cyclase protein normally interacts, including but not limited to, calmodulin, or a G-protein subunit (one or more of alpha, beta, or gamma). The assay includes the steps of combining the adenylate cyclase protein with a candidate compound under conditions that allow the adenylate cyclase protein or fragment to interact with the target molecule, and detecting the formation of a complex between the adenylate cyclase protein and the target, or detecting the biochemical consequence of the interaction between the adenylate cyclase and the target, such as any of the associated effects of signal transduction such as protein kinase A phosphorylation, cAMP or cGMP turnover, and biological endpoints of the pathway.

Determining the ability of the adenylate cyclase to bind to a target molecule can also be accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) (Sjolander et al. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). As used herein, “BIA” is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.
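By way of illustration only, real-time binding data of the kind described above are commonly interpreted with a simple 1:1 interaction model. The following is a minimal sketch under that assumption; the 1:1 model, the function names, and the parameters (ka, kd, rmax) are hypothetical illustrations and are not taken from the cited references or from any instrument software.

```python
import math


def equilibrium_response(conc, ka, kd, rmax):
    """Steady-state response for an assumed 1:1 interaction.

    conc : analyte concentration (M)
    ka   : association rate constant (1/(M*s))
    kd   : dissociation rate constant (1/s)
    rmax : response at full surface saturation (arbitrary units)
    """
    if ka <= 0 or kd <= 0:
        raise ValueError("rate constants must be positive")
    kd_eq = kd / ka  # equilibrium dissociation constant KD (M)
    return rmax * conc / (conc + kd_eq)


def association_response(t, conc, ka, kd, rmax):
    """Response t seconds after analyte injection begins (1:1 model)."""
    k_obs = ka * conc + kd  # observed pseudo-first-order rate (1/s)
    r_eq = equilibrium_response(conc, ka, kd, rmax)
    return r_eq * (1.0 - math.exp(-k_obs * t))


def dissociation_response(t, r0, kd):
    """Response t seconds after analyte is removed, starting from r0."""
    return r0 * math.exp(-kd * t)
```

In such a sketch, a test compound that competes with the target for the adenylate cyclase would be expected to lower the apparent response at a given target concentration; fitting measured curves to estimate ka and kd is outside the scope of the example.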

The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. '409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390); (Devlin (1990) Science 249:404-406); (Cwirla et al. (1990) Proc. Natl. Acad. Sci. 87:6378-6382); (Felici (1991) J. Mol. Biol. 222:301-310); (Ladner supra).

Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al. (1991) Nature 354:82-84; Houghten et al. (1991) Nature 354:84-86) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al. (1993) Cell 72:767-778); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble full-length adenylate cyclase or fragment that competes for GTP or ATP binding. Other candidate compounds include mutant adenylate cyclases or appropriate fragments containing mutations that affect adenylate cyclase function and thus compete for GTP or ATP. Accordingly, a fragment that competes for ATP or GTP, for example with a higher affinity, or a fragment that binds ATP or GTP but does not cyclize it, is encompassed by the invention. Other fragments that are encompassed include, but are not limited to, those that will bind but not be activated by G-protein subunits, or bind but not be activated by calmodulin.

The invention provides other end points to identify compounds that modulate (stimulate or inhibit) adenylate cyclase activity. The assays typically involve an assay of events in the signal transduction pathway that indicate adenylate cyclase activity. Thus, the expression of genes that are up- or down-regulated in response to the adenylate cyclase dependent signal cascade can be assayed. In one embodiment, the regulatory region of such genes can be operably linked to a marker that is easily detectable, such as luciferase.

Any of the biological or biochemical functions mediated by the adenylate cyclase can be used as an endpoint assay. These include all of the biochemical or biochemical/biological events described herein, in the references cited herein, incorporated by reference for these endpoint assay targets, and other functions known to those of ordinary skill in the art.

In the case of the adenylate cyclase, specific end points can include ATP and GTP cyclization and a decrease or increase in intracellular cAMP or cGMP concentrations or in protein kinase A activation.

Assays for adenylate cyclase function include, but are not limited to, those that are well known in the art and available to the person of ordinary skill in the art, for example, G-protein subunit binding and activation of adenyl cyclase such as that disclosed in Taussig et al. (1995), or Sunahara et al., herein above, effect on cAMP- or cGMP-dependent kinases, as described for example in Devlin, herein above, changes in intracellular cAMP and/or cGMP concentration, as described in Sunahara et al., herein above, and stimulation by calmodulin in vitro, as disclosed in Sunahara et al., herein above. Further, nucleotide triphosphate binding domains (e.g., for ATP and GTP) can be assayed according to Tang et al. (1995), herein above. All of these references are incorporated herein by reference for these assays.

Binding and/or activating compounds can also be screened by using chimeric adenylate cyclase proteins in which one or more domains, sites, and the like, as disclosed herein, or parts thereof, can be replaced by their heterologous counterparts derived from other adenylate cyclase isoforms of the same family or from adenylate cyclase isoforms of any other adenylate cyclase family. For example, a catalytic region can be used that interacts with a different cyclic nucleotide specificity and/or affinity than the native adenylate cyclase. Accordingly, a different set of signal transduction components is available as an end-point assay for activation. Alternatively, a heterologous effector protein binding/activation sequence can replace the native sequence. For example, a different G-protein subunit can be bound or interact with the modified adenyl cyclase. Accordingly, the adenyl cyclase is subject to different modulation by different stimulatory or inhibitory G-protein subunits based on inhibitory or stimulatory receptor interaction with the G-protein. As a further alternative, the site of modification by an effector protein, for example phosphorylation by a protein kinase, can be replaced with the site from a different effector protein. This could also provide the use of a different signal transduction pathway for endpoint determination. Activation can also be detected by a reporter gene containing an easily detectable coding region operably linked to a transcriptional regulatory sequence that is part of the native signal transduction pathway.

The invention provides competition binding assays designed to discover compounds that interact with the adenylate cyclase. Thus, a compound is exposed to an adenylate cyclase polypeptide under conditions that allow the compound to bind or to otherwise interact with the polypeptide. Soluble adenylate cyclase polypeptide is also added to the mixture. If the test compound interacts with the soluble adenylate cyclase polypeptide, it decreases the amount of complex formed or activity from the adenylate cyclase target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the adenylate cyclase. Thus, the soluble polypeptide that competes with the target adenylate cyclase region is designed to contain peptide sequences corresponding to the region of interest.

Another type of competition-binding assay can be used to discover compounds that interact with specific functional sites. As an example, calmodulin or one or more G-protein subunits and a candidate compound can be added to a sample of the adenylate cyclase. Compounds that interact with the adenylate cyclase at the same site as these components will reduce the amount of complex formed between the adenylate cyclase and these components. Accordingly, it is possible to discover a compound that specifically prevents interaction between the adenylate cyclase and these components. Another example involves adding a candidate compound to a sample of adenylate cyclase and ATP or GTP. A compound that competes with ATP or GTP will reduce the amount of cyclization or binding of the ATP or GTP to the adenylate cyclase. Accordingly, compounds can be discovered that directly interact with the adenylate cyclase and compete with ATP or GTP. Such assays can involve any other component that interacts with the adenylate cyclase.
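For illustration only, the expected reduction in complex formation when a candidate compound competes with ATP or GTP at the same site can be estimated with the standard single-site competitive binding relation. The sketch below assumes one binding site, equilibrium conditions, and ligand and competitor in excess over the adenylate cyclase (so free concentrations approximate total concentrations); the function names and all parameters are hypothetical and are not part of the disclosed assays.

```python
def fractional_occupancy(ligand_conc, kd, competitor_conc=0.0, ki=1.0):
    """Fraction of binding sites occupied by the ligand (e.g., ATP or GTP).

    Single-site competitive model (assumed):
        occupancy = [L] / ([L] + Kd * (1 + [I] / Ki))
    All concentrations and constants must share the same units.
    """
    if kd <= 0 or ki <= 0:
        raise ValueError("kd and ki must be positive")
    if ligand_conc < 0 or competitor_conc < 0:
        raise ValueError("concentrations must be non-negative")
    apparent_kd = kd * (1.0 + competitor_conc / ki)
    return ligand_conc / (ligand_conc + apparent_kd)


def percent_displacement(ligand_conc, kd, competitor_conc, ki):
    """Percent reduction in ligand binding caused by the competitor."""
    without = fractional_occupancy(ligand_conc, kd)
    if without == 0.0:
        raise ValueError("no ligand binding to displace")
    with_comp = fractional_occupancy(ligand_conc, kd, competitor_conc, ki)
    return 100.0 * (1.0 - with_comp / without)
```

Under these assumptions, a candidate compound that gives a larger percent displacement at a given concentration would be ranked as a stronger competitor for the nucleotide binding site.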

To perform cell-free drug screening assays, it is desirable to immobilize either the adenylate cyclase, or fragment, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase/adenylate cyclase fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of adenylate cyclase-binding protein found in the bead fraction quantitated from the gel using standard electrophoretic techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the protein but which do not interfere with binding of the protein to its target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by antibody conjugation. Preparations of an adenylate cyclase-binding component, such as ATP or G-protein subunit, and a candidate compound are incubated in the adenylate cyclase-presenting wells and the amount of complex trapped in the well can be quantitated. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the adenylate cyclase target molecule, or which are reactive with adenylate cyclase and compete with the target molecule; as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the target molecule.

Modulators of adenylate cyclase level or activity identified according to these assays can be used to test the effects of modulation of expression of the enzyme on the outcome of clinically relevant disorders. This can be accomplished in vitro, in vivo, such as in human clinical trials, and in test models derived from other organisms, such as non-human transgenic subjects. Modulation in such subjects includes, but is not limited to, modulation of the cells, tissues, and disorders particularly disclosed herein. Modulators of adenylate cyclase activity identified according to these drug screening assays can be used to treat a subject with a disorder mediated by the adenylate cyclase pathway, by treating cells that express the adenylate cyclase, such as those disclosed herein, especially in FIGS. 50 and 51, as well as those disorders disclosed in the references cited herein above. In one embodiment, the cells that are treated are derived from prostate, skeletal muscle, brain, testis and aorta, and as such, modulation is particularly relevant to disorders involving these tissues. In another embodiment, modulation is in aortic tissue with intimal proliferations or in ischemic or myopathic heart tissue. Accordingly, disorders in which modulation is particularly relevant can include these tissues. These methods of treatment include the steps of administering the modulators of adenylate cyclase activity in a pharmaceutical composition as described herein, to a subject in need of such treatment.

Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

Disorders involving the brain include, but are not limited to, disorders involving neurons, and disorders involving glia, such as astrocytes, oligodendrocytes, ependymal cells, and microglia; cerebral edema, raised intracranial pressure and herniation, and hydrocephalus; malformations and developmental diseases, such as neural tube defects, forebrain anomalies, posterior fossa anomalies, and syringomyelia and hydromyelia; perinatal brain injury; cerebrovascular diseases, such as those related to hypoxia, ischemia, and infarction, including hypotension, hypoperfusion, and low-flow states (global cerebral ischemia and focal cerebral ischemia), infarction from obstruction of local blood supply, intracranial hemorrhage, including intracerebral (intraparenchymal) hemorrhage, subarachnoid hemorrhage and ruptured berry aneurysms, and vascular malformations, hypertensive cerebrovascular disease, including lacunar infarcts, slit hemorrhages, and hypertensive encephalopathy; infections, such as acute meningitis, including acute pyogenic (bacterial) meningitis and acute aseptic (viral) meningitis, acute focal suppurative infections, including brain abscess, subdural empyema, and extradural abscess, chronic bacterial meningoencephalitis, including tuberculosis and mycobacterioses, neurosyphilis, and neuroborreliosis (Lyme disease), viral meningoencephalitis, including arthropod-borne (Arbo) viral encephalitis, Herpes simplex virus Type 1, Herpes simplex virus Type 2, Varicella-zoster virus (Herpes zoster), cytomegalovirus, poliomyelitis, rabies, and human immunodeficiency virus 1, including HIV-1 meningoencephalitis (subacute encephalitis), vacuolar myelopathy, AIDS-associated myopathy, peripheral neuropathy, and AIDS in children, progressive multifocal leukoencephalopathy, subacute sclerosing panencephalitis, fungal meningoencephalitis, other infectious diseases of the nervous system; transmissible spongiform encephalopathies (prion diseases); demyelinating diseases, including multiple sclerosis, multiple sclerosis variants, acute disseminated encephalomyelitis and acute necrotizing hemorrhagic encephalomyelitis, and other diseases with demyelination; degenerative diseases, such as degenerative diseases affecting the cerebral cortex, including Alzheimer disease and Pick disease, degenerative diseases of basal ganglia and brain stem, including Parkinsonism, idiopathic Parkinson disease (paralysis agitans), progressive supranuclear palsy, corticobasal degeneration, multiple system atrophy, including striatonigral degeneration, Shy-Drager syndrome, and olivopontocerebellar atrophy, and Huntington disease; spinocerebellar degenerations, including spinocerebellar ataxias, including Friedreich ataxia, and ataxia-telangiectasia, degenerative diseases affecting motor neurons, including amyotrophic lateral sclerosis (motor neuron disease), bulbospinal atrophy (Kennedy syndrome), and spinal muscular atrophy; inborn errors of metabolism, such as leukodystrophies, including Krabbe disease, metachromatic leukodystrophy, adrenoleukodystrophy, Pelizaeus-Merzbacher disease, and Canavan disease, mitochondrial encephalomyopathies, including Leigh disease and other mitochondrial encephalomyopathies; toxic and acquired metabolic diseases, including vitamin deficiencies such as thiamine (vitamin B₁) deficiency and vitamin B₁₂ deficiency, neurologic sequelae of metabolic disturbances, including hypoglycemia, hyperglycemia, and hepatic encephalopathy, toxic disorders, including carbon monoxide, methanol, ethanol, and radiation, including combined methotrexate and radiation-induced injury; tumors, such as gliomas, including astrocytoma, including fibrillary (diffuse) astrocytoma and glioblastoma multiforme, pilocytic astrocytoma, pleomorphic xanthoastrocytoma, and brain stem glioma, oligodendroglioma, and ependymoma and related paraventricular mass lesions, neuronal tumors, poorly differentiated neoplasms, including medulloblastoma, other parenchymal tumors, including primary brain lymphoma, germ cell tumors, and pineal parenchymal tumors, meningiomas, metastatic tumors, paraneoplastic syndromes, peripheral nerve sheath tumors, including schwannoma, neurofibroma, and malignant peripheral nerve sheath tumor (malignant schwannoma), and neurocutaneous syndromes (phakomatoses), including neurofibromatosis, including Type 1 neurofibromatosis (NF1) and Type 2 neurofibromatosis (NF2), tuberous sclerosis, and Von Hippel-Lindau disease.

Diseases of the skin include, but are not limited to, disorders of pigmentation and melanocytes, including but not limited to, vitiligo, freckle, melasma, lentigo, nevocellular nevus, dysplastic nevi, and malignant melanoma; benign epithelial tumors, including but not limited to, seborrheic keratoses, acanthosis nigricans, fibroepithelial polyp, epithelial cyst, keratoacanthoma, and adnexal (appendage) tumors; premalignant and malignant epidermal tumors, including but not limited to, actinic keratosis, squamous cell carcinoma, basal cell carcinoma, and Merkel cell carcinoma; tumors of the dermis, including but not limited to, benign fibrous histiocytoma, dermatofibrosarcoma protuberans, xanthomas, and dermal vascular tumors; tumors of cellular immigrants to the skin, including but not limited to, histiocytosis X, mycosis fungoides (cutaneous T-cell lymphoma), and mastocytosis; disorders of epidermal maturation, including but not limited to, ichthyosis; acute inflammatory dermatoses, including but not limited to, urticaria, acute eczematous dermatitis, and erythema multiforme; chronic inflammatory dermatoses, including but not limited to, psoriasis, lichen planus, and lupus erythematosus; blistering (bullous) diseases, including but not limited to, pemphigus, bullous pemphigoid, dermatitis herpetiformis, and noninflammatory blistering diseases: epidermolysis bullosa and porphyria; disorders of epidermal appendages, including but not limited to, acne vulgaris; panniculitis, including but not limited to, erythema nodosum and erythema induratum; and infection and infestation, such as verrucae, molluscum contagiosum, impetigo, superficial fungal infections, and arthropod bites, stings, and infestations.

Disorders involving the heart include, but are not limited to, heart failure, including but not limited to, cardiac hypertrophy, left-sided heart failure, and right-sided heart failure; ischemic heart disease, including but not limited to angina pectoris, myocardial infarction, chronic ischemic heart disease, and sudden cardiac death; hypertensive heart disease, including but not limited to, systemic (left-sided) hypertensive heart disease and pulmonary (right-sided) hypertensive heart disease; valvular heart disease, including but not limited to, valvular degeneration caused by calcification, such as calcific aortic stenosis, calcification of a congenitally bicuspid aortic valve, and mitral annular calcification, and myxomatous degeneration of the mitral valve (mitral valve prolapse), rheumatic fever and rheumatic heart disease, infective endocarditis, and noninfected vegetations, such as nonbacterial thrombotic endocarditis and endocarditis of systemic lupus erythematosus (Libman-Sacks disease), carcinoid heart disease, and complications of artificial valves; myocardial disease, including but not limited to dilated cardiomyopathy, hypertrophic cardiomyopathy, restrictive cardiomyopathy, and myocarditis; pericardial disease, including but not limited to, pericardial effusion and hemopericardium and pericarditis, including acute pericarditis and healed pericarditis, and rheumatoid heart disease; neoplastic heart disease, including but not limited to, primary cardiac tumors, such as myxoma, lipoma, papillary fibroelastoma, rhabdomyoma, and sarcoma, and cardiac effects of noncardiac neoplasms; congenital heart disease, including but not limited to, left-to-right shunts (late cyanosis), such as atrial septal defect, ventricular septal defect, patent ductus arteriosus, and atrioventricular septal defect, right-to-left shunts (early cyanosis), such as tetralogy of Fallot, transposition of great arteries, truncus arteriosus, tricuspid atresia, and total anomalous pulmonary venous connection, obstructive congenital anomalies, such as coarctation of aorta, pulmonary stenosis and atresia, and aortic stenosis and atresia, and disorders involving cardiac transplantation.

Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

Disorders involving the kidney include, but are not limited to, congenital anomalies including, but not limited to, cystic diseases of the kidney, that include but are not limited to, cystic renal dysplasia, autosomal dominant (adult) polycystic kidney disease, autosomal recessive (childhood) polycystic kidney disease, and cystic diseases of renal medulla, which include, but are not limited to, medullary sponge kidney, and nephronophthisis-uremic medullary cystic disease complex, acquired (dialysis-associated) cystic disease, such as simple cysts; glomerular diseases including pathologies of glomerular injury that include, but are not limited to, in situ immune complex deposition, that includes, but is not limited to, anti-GBM nephritis, Heymann nephritis, and antibodies against planted antigens, circulating immune complex nephritis, antibodies to glomerular cells, cell-mediated immunity in glomerulonephritis, activation of alternative complement pathway, epithelial cell injury, and pathologies involving mediators of glomerular injury including cellular and soluble mediators, acute glomerulonephritis, such as acute proliferative (poststreptococcal, postinfectious) glomerulonephritis, including but not limited to, poststreptococcal glomerulonephritis and nonstreptococcal acute glomerulonephritis, rapidly progressive (crescentic) glomerulonephritis, nephrotic syndrome, membranous glomerulonephritis (membranous nephropathy), minimal change disease (lipoid nephrosis), focal segmental glomerulosclerosis, membranoproliferative glomerulonephritis, IgA nephropathy (Berger disease), focal proliferative and necrotizing glomerulonephritis (focal glomerulonephritis), hereditary nephritis, including but not limited to, Alport syndrome and thin membrane disease (benign familial hematuria), chronic glomerulonephritis, glomerular lesions associated with systemic disease, including but not limited to, systemic lupus erythematosus, Henoch-Schönlein purpura, bacterial endocarditis, diabetic glomerulosclerosis, amyloidosis, fibrillary and immunotactoid glomerulonephritis, and other systemic disorders; diseases affecting tubules and interstitium, including acute tubular necrosis and tubulointerstitial nephritis, including but not limited to, pyelonephritis and urinary tract infection, acute pyelonephritis, chronic pyelonephritis and reflux nephropathy, and tubulointerstitial nephritis induced by drugs and toxins, including but not limited to, acute drug-induced interstitial nephritis, analgesic abuse nephropathy, nephropathy associated with nonsteroidal anti-inflammatory drugs, and other tubulointerstitial diseases including, but not limited to, urate nephropathy, hypercalcemia and nephrocalcinosis, and multiple myeloma; diseases of blood vessels including benign nephrosclerosis, malignant hypertension and accelerated nephrosclerosis, renal artery stenosis, and thrombotic microangiopathies including, but not limited to, classic (childhood) hemolytic-uremic syndrome, adult hemolytic-uremic syndrome/thrombotic thrombocytopenic purpura, idiopathic HUS/TTP, and other vascular disorders including, but not limited to, atherosclerotic ischemic renal disease, atheroembolic renal disease, sickle cell disease nephropathy, diffuse cortical necrosis, and renal infarcts; urinary tract obstruction (obstructive uropathy); urolithiasis (renal calculi, stones); and tumors of the kidney including, but not limited to, benign tumors, such as renal papillary adenoma, renal fibroma or hamartoma (renomedullary interstitial cell tumor), angiomyolipoma, and oncocytoma, and malignant tumors, including renal cell carcinoma (hypernephroma, adenocarcinoma of kidney), which includes urothelial carcinomas of renal pelvis.

Disorders involving the testis and epididymis include, but are not limited to, congenital anomalies such as cryptorchidism, regressive changes such as atrophy, inflammations such as nonspecific epididymitis and orchitis, granulomatous (autoimmune) orchitis, and specific inflammations including, but not limited to, gonorrhea, mumps, tuberculosis, and syphilis, vascular disturbances including torsion, testicular tumors including germ cell tumors that include, but are not limited to, seminoma, spermatocytic seminoma, embryonal carcinoma, yolk sac tumor, choriocarcinoma, teratoma, and mixed tumors, tumors of sex cord-gonadal stroma including, but not limited to, Leydig (interstitial) cell tumors and Sertoli cell tumors (androblastoma), and testicular lymphoma, and miscellaneous lesions of tunica vaginalis.

Disorders involving the prostate include, but are not limited to, inflammations, benign enlargement, for example, nodular hyperplasia (benign prostatic hypertrophy or hyperplasia), and tumors such as carcinoma.

Disorders involving the skeletal muscle include tumors such as rhabdomyosarcoma.

Disorders involving the ovary include, for example, polycystic ovarian disease, Stein-Leventhal syndrome, pseudomyxoma peritonei and stromal hyperthecosis; ovarian tumors such as tumors of coelomic epithelium, serous tumors, mucinous tumors, endometrioid tumors, clear cell adenocarcinoma, cystadenofibroma, Brenner tumor, surface epithelial tumors; germ cell tumors such as mature (benign) teratomas, monodermal teratomas, immature malignant teratomas, dysgerminoma, endodermal sinus tumor, choriocarcinoma; sex cord-stromal tumors such as granulosa-theca cell tumors, thecoma-fibromas, androblastomas, hilus cell tumors, and gonadoblastoma; and metastatic tumors such as Krukenberg tumors.

The invention thus provides methods for treating a disorder characterized by aberrant expression or activity of an adenylate cyclase. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) expression or activity of the protein. In another embodiment, the method involves administering the adenylate cyclase as therapy to compensate for reduced or aberrant expression or activity of the protein.

Methods for treatment include but are not limited to the use of soluble adenylate cyclase or fragments of the adenylate cyclase protein that compete for ATP or GTP or G-protein. These adenylate cyclases or fragments can have a higher affinity for the target so as to provide effective competition.

Stimulation of activity is desirable in situations in which the protein is abnormally downregulated and/or in which increased activity is likely to have a beneficial effect. Likewise, inhibition of activity is desirable in situations in which the protein is abnormally upregulated and/or in which decreased activity is likely to have a beneficial effect. In one example of such a situation, a subject has a disorder characterized by aberrant development or cellular differentiation. In another example, the subject has a proliferative disease (e.g., cancer) or a disorder characterized by an aberrant hematopoietic response. In another example, it is desirable to achieve tissue regeneration in a subject (e.g., where a subject has undergone brain or spinal cord injury and it is desirable to regenerate neuronal tissue in a regulated manner).

The invention also provides methods for diagnosing a disease or predisposition to disease mediated by the adenylate cyclase, including, but not limited to, diseases involving tissues in which the adenylate cyclases are expressed, as disclosed herein, and particularly in prostate, skeletal muscle, brain, testes, as well as aorta, aorta with intimal proliferations, internal mammary artery, kidney, and saphenous vein. In addition, as indicated in FIG. 51, positive differential expression occurs in diseased heart tissue from patients with myopathy and ischemia. In view of these results, in one embodiment of the invention, these disorders are treated by modulating the level or activity of the adenylate cyclase gene in diseased hearts. Treatment is therefore especially directed to these tissues and cells thereof. Likewise, in one embodiment, diagnosis is directed to cells and tissues involved in these disorders. As mentioned above, treatment and diagnosis can be in human subjects in which the disease normally occurs and in model systems, both in vitro and in vivo, such as in transgenic animals.

Accordingly, methods are directed to detecting the presence, or levels of, the adenylate cyclase in a cell, tissue, or organism. The methods involve contacting a biological sample with a compound capable of interacting with the adenylate cyclase such that the interaction can be detected.

One agent for detecting adenylate cyclase is an antibody capable of selectively binding to adenylate cyclase. A biological sample includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject.

The invention also provides methods for diagnosing active disease, or predisposition to disease, in a patient having a variant adenylate cyclase. Thus, adenylate cyclase can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in an aberrant protein. This includes amino acid substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest, altered adenylate cyclase activity in cell-based or cell-free assay, alteration in ATP or GTP binding or cyclization, G-protein subunit binding or calmodulin or antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a protein in general or in an adenylate cyclase specifically.

In vitro techniques for detection of adenylate cyclase include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. Alternatively, the protein can be detected in vivo in a subject by introducing into the subject a labeled anti-adenylate cyclase antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Particularly useful are methods that detect the allelic variant of the adenylate cyclase expressed in a subject and methods that detect fragments of the adenylate cyclase in a sample.

The invention also provides methods of pharmacogenomic analysis including, but not limited to, in the cells, tissues and disorders disclosed herein in which expression of the adenylate cyclase either occurs or shows differential expression. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M. (1996) Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985, and Linder, M. W. (1997) Clin. Chem. 43(2):254-266. The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants of the adenylate cyclase in which one or more of the adenylate cyclase functions in one population is different from those in another population. The polypeptides can be used as a target to ascertain a genetic predisposition that can affect treatment modality. Thus, in a GTP- or ATP-based treatment, polymorphism may give rise to catalytic regions that are more or less active. Accordingly, dosage would necessarily be modified to maximize the therapeutic effect within a given population containing the polymorphism. As an alternative to genotyping, specific polymorphic polypeptides could be identified.

The invention also provides for monitoring therapeutic effects during clinical trials and other treatment. Thus, the therapeutic effectiveness of an agent that is designed to increase or decrease gene expression, protein levels or adenylate cyclase activity can be monitored over the course of treatment using the adenylate cyclase polypeptides as an end-point target. The monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression or activity of the protein in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein in the post-administration samples; (v) comparing the level of expression or activity of the protein in the pre-administration sample with the protein in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.
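By way of non-limiting illustration only, the comparison in step (v) above might be sketched as follows. The function name `compare_levels`, the use of a mean over the post-administration samples, and the 10% `threshold` are assumptions made for the example and are not taken from this disclosure; the adjustment in step (vi) is deliberately left outside the sketch.

```python
from statistics import mean


def compare_levels(pre_level, post_levels, intended, threshold=0.10):
    """Compare a pre-administration level with post-administration levels.

    pre_level   -- measured expression or activity before the agent (step ii)
    post_levels -- one or more measurements after the agent (step iv)
    intended    -- "increase" or "decrease", the effect the agent is designed to have
    threshold   -- hypothetical minimum fractional change counted as a response
    """
    if pre_level <= 0:
        raise ValueError("pre_level must be positive")
    if not post_levels:
        raise ValueError("at least one post-administration level is required")
    if intended not in ("increase", "decrease"):
        raise ValueError("intended must be 'increase' or 'decrease'")

    # Step (v): fractional change of the mean post level relative to the pre level.
    change = (mean(post_levels) - pre_level) / pre_level

    if intended == "increase":
        responded = change >= threshold
    else:
        responded = change <= -threshold

    return {"fractional_change": change, "responded": responded}


if __name__ == "__main__":
    # Hypothetical values in arbitrary units.
    print(compare_levels(100.0, [80.0, 60.0], "decrease"))
```

The returned values could then inform, but do not themselves determine, whether administration of the agent is increased or decreased.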

Polypeptides

The methods and uses herein disclosed can be based on polypeptide reagents and targets. The invention is thus based on the discovery of a novel human adenylate cyclase. Specifically, an expressed sequence tag (EST) was selected based on homology to adenylate cyclase sequences. This EST was used to design primers based on sequences that it contains and used to identify a cDNA from a fetal testis cDNA library. Positive clones were sequenced and the overlapping fragments were assembled. Analysis of the assembled sequence revealed that the cloned cDNA molecule encodes an adenylate cyclase similar to a rat adenylate cyclase.

The invention thus relates to a novel human adenylate cyclase and to the expression of the adenylate cyclase having the deduced amino acid sequence shown in FIGS. 46A-46D (SEQ ID NO:19) or having the amino acid sequence encoded by the cDNA insert of the plasmid deposited with the ATCC as Patent Deposit Number PTA-1871.

The deposits will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms. The deposits are provided as a convenience to those of skill in the art and are not an admission that a deposit is required under 35 U.S.C. § 112. The deposited sequences, as well as the polypeptides encoded by the sequences, are incorporated herein by reference and control in the event of any conflict, such as a sequencing error, with the description in this application.

“Adenylate cyclase polypeptide” or “adenylate cyclase protein” refers to the polypeptide in SEQ ID NO:19 or encoded by the deposited cDNA. The term “adenylate cyclase protein” or “adenylate cyclase polypeptide”, however, further includes the numerous variants described herein, as well as fragments derived from the full-length adenylate cyclases and variants.

Tissues and/or cells in which the adenylate cyclase is found include, but are not limited to, those shown in FIGS. 50 and 51, and particularly prostate, skeletal muscle, brain, testis and aorta. In addition, the adenylate cyclase is expressed in diseased tissues, including but not limited to heart tissue derived from patients with myopathy or ischemia.

The present invention thus provides an isolated or purified adenylate cyclase polypeptide and variants and fragments thereof.

Based on a BLAST search, high homology was shown to adenyl cyclase from rat, CYA2 Type II (EC 4.6.1.1) (ATP pyrophosphate-lyase), SwissProt Acc. No. P26769, and a rat adenyl cyclase, PATENT Acc. No. R94560.

As used herein, a polypeptide is said to be “isolated” or “purified” when it is substantially free of cellular material, when it is isolated from recombinant and non-recombinant cells, or free of chemical precursors or other chemicals when it is chemically synthesized. A polypeptide, however, can be joined to another polypeptide with which it is not normally associated in a cell and still be considered “isolated” or “purified.”

The adenylate cyclase polypeptides can be purified to homogeneity. It is understood, however, that preparations in which the polypeptide is not purified to homogeneity are useful and considered to contain an isolated form of the polypeptide. The critical feature is that the preparation allows for the desired function of the polypeptide, even in the presence of considerable amounts of other components. Thus, the invention encompasses various degrees of purity.

In one embodiment, the language “substantially free of cellular material” includes preparations of the adenylate cyclase having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins. When the polypeptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20%, less than about 10%, or less than about 5% of the volume of the protein preparation.

An adenylate cyclase polypeptide is also considered to be isolated when it is part of a membrane preparation or is purified and then reconstituted with membrane vesicles or liposomes.

The language “substantially free of chemical precursors or other chemicals” includes preparations of the adenylate cyclase polypeptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language “substantially free of chemical precursors or other chemicals” includes preparations of the polypeptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

In one embodiment, the adenylate cyclase polypeptide comprises the amino acid sequence shown in SEQ ID NO:19. However, the invention also encompasses sequence variants. Variants include a substantially homologous protein encoded by the same genetic locus in an organism, i.e., an allelic variant.

Variants also encompass proteins derived from other genetic loci in an organism, but having substantial homology to the adenylate cyclase of SEQ ID NO:19. Variants also include proteins substantially homologous to the adenylate cyclase but derived from another organism, i.e., an ortholog. Variants also include proteins that are substantially homologous to the adenylate cyclase that are produced by chemical synthesis. Variants also include proteins that are substantially homologous to the adenylate cyclase that are produced by recombinant methods. It is understood, however, that variants exclude any amino acid sequences disclosed prior to the invention.

As used herein, two proteins (or a region of the proteins) are substantially homologous when the amino acid sequences are at least about 70-75%, typically at least about 80-85%, and most typically at least about 90-95% or more homologous. A substantially homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid sequence hybridizing to the nucleic acid sequence, or portion thereof, of the sequence shown in SEQ ID NO:20 under stringent conditions as more fully described below.

To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, or 90% or more of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
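As a minimal sketch of the calculation just described, assuming the two sequences have already been aligned and padded with a gap character, percent identity can be computed as identical positions divided by aligned columns. Counting every gapped column in the denominator is only one of several conventions in use and is an assumption of this example, as are the function name `percent_identity` and the `-` gap symbol.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Columns in which both sequences carry a gap are skipped. Columns with a
    gap in only one sequence count toward the denominator, so every gap
    position lowers the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    columns = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # uninformative column
        columns += 1
        if a == b:
            identical += 1

    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * identical / columns


if __name__ == "__main__":
    # Hypothetical aligned peptides: 6 identical positions over 8 columns.
    print(percent_identity("ACDEFG-H", "ACDQFGKH"))
```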

The invention also encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by the adenylate cyclase. Similarity is determined by conserved amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Conservative substitutions are likely to be phenotypically silent. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1. Conservative Amino Acid Substitutions.
Aromatic: Phenylalanine, Tryptophan, Tyrosine
Hydrophobic: Leucine, Isoleucine, Valine
Polar: Glutamine, Asparagine
Basic: Arginine, Lysine, Histidine
Acidic: Aspartic Acid, Glutamic Acid
Small: Alanine, Serine, Threonine, Methionine, Glycine
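For illustration only, the groupings of Table 1 can be encoded as a lookup so that a substitution is treated as conservative when both residues fall in the same group. The one-letter codes, the names `GROUPS` and `is_conservative`, and the choice to treat residues absent from the table (for example cysteine and proline) as non-conservative are assumptions of this sketch.

```python
# Table 1 groups, written with standard one-letter amino acid codes.
GROUPS = {
    "aromatic": "FWY",     # Phe, Trp, Tyr
    "hydrophobic": "LIV",  # Leu, Ile, Val
    "polar": "QN",         # Gln, Asn
    "basic": "RKH",        # Arg, Lys, His
    "acidic": "DE",        # Asp, Glu
    "small": "ASTMG",      # Ala, Ser, Thr, Met, Gly
}

# Reverse lookup: residue -> group name.
RESIDUE_TO_GROUP = {
    residue: name for name, residues in GROUPS.items() for residue in residues
}


def is_conservative(original, replacement):
    """Return True when both residues belong to the same Table 1 group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    group = RESIDUE_TO_GROUP.get(a)
    return group is not None and group == RESIDUE_TO_GROUP.get(b)


if __name__ == "__main__":
    print(is_conservative("L", "I"))  # both hydrophobic
    print(is_conservative("K", "D"))  # basic versus acidic
```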

The comparison of sequences and determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm. (Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991).

A preferred, non-limiting example of such a mathematical algorithm is described in Karlin et al. (1993) Proc. Natl. Acad. Sci. USA 90:5873-5877. Such an algorithm is incorporated into the NBLAST and XBLAST programs (version 2.0) as described in Altschul et al. (1997) Nucleic Acids Res. 25:3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., NBLAST) can be used. See www.ncbi.nlm.nih.gov. In one embodiment, parameters for sequence comparison can be set at score=100, wordlength=12, or can be varied (e.g., W=5 or W=20).

In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman et al. (1970) (J. Mol. Biol. 48:444-453) algorithm which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (Devereux et al. (1984) Nucleic Acids Res. 12(1):387) (available at www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6.
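The following is a simplified sketch of global alignment by dynamic programming in the manner of Needleman and Wunsch. It is not the GAP program: it assumes a flat match/mismatch score instead of a BLOSUM 62 or PAM250 matrix, and a single linear gap penalty instead of separate gap and length weights. The function name and default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return (score, aligned_a, aligned_b)."""
    n, m = len(a), len(b)

    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def pair(i, j):
        # Score for aligning residue a[i-1] against residue b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair(i, j),  # align the two residues
                score[i - 1][j] + gap,             # gap in b
                score[i][j - 1] + gap,             # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
```

The aligned strings returned here are in the form expected by a column-by-column identity count.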

Another preferred, non-limiting example of a mathematical algorithm utilized for the comparison of sequences is the algorithm of Myers and Miller, CABIOS (1989). Such an algorithm is incorporated into the ALIGN program (version 2.0) which is part of the GCG sequence alignment software package. When utilizing the ALIGN program for comparing amino acid sequences, a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4 can be used. Additional algorithms for sequence analysis are known in the art and include ADVANCE and ADAM as described in Torellis et al. (1994) Comput. Appl. Biosci. 10:3-5; and FASTA described in Pearson et al. (1988) PNAS 85:2444-8.

A variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these.

Variant polypeptides can be fully functional or can lack function in one or more activities. Thus, in the present case, variations can affect the function, for example, of one or more of the regions corresponding to a catalytic region, regulatory region, targeting region, regions involved in membrane association, regions involved in enzyme activation, for example, by phosphorylation, and regions involved in interaction with components of the cyclic nucleotide-dependent signal transduction pathways (e.g., ATP, GTP, G-protein, or calmodulin).

Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. Functional variants can also contain substitution of similar amino acids, which results in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree.

Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncations, or a substitution, insertion, inversion, or deletion in a critical residue or critical region.

As indicated, variants can be naturally-occurring or can be made by recombinant means or chemical synthesis to provide useful and novel characteristics for the adenylate cyclase polypeptide. This includes preventing immunogenicity from pharmaceutical formulations by preventing protein aggregation.

Useful variations further include alteration of catalytic activity. For example, one embodiment involves a variation at the binding site that results in binding but not cyclization, or slower cyclization, of ATP or GTP. A further useful variation at the same site can result in altered affinity for ATP or GTP. Useful variation includes one that prevents activation by G-protein. Another useful variation provides a fusion protein in which one or more domains or subregions are operationally fused to one or more domains or subregions from another adenylate cyclase isoform or family.

Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al. (1989) Science 244:1081-1085). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity, such as ATP or GTP cyclization in vitro or cGMP- or cAMP-dependent in vitro activity, such as proliferative activity. Sites that are critical for GTP or ATP or G-protein binding can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al. (1992) J. Mol. Biol. 224:899-904; de Vos et al. (1992) Science 255:306-312).
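As a non-limiting sketch of how the set of single-alanine mutants for such a scan might be enumerated in silico, the function below substitutes alanine at each position in turn. The function name, the `X<position>A` label format, the decision to skip residues that are already alanine, and the short example peptide are all assumptions of the illustration; the activity testing itself is a laboratory step outside the sketch.

```python
def alanine_scan(sequence):
    """Return (label, mutant_sequence) pairs for each single alanine substitution.

    Positions are numbered from 1. Residues that are already alanine are
    skipped because substituting them would reproduce the starting sequence.
    """
    mutants = []
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        label = f"{residue}{index + 1}A"
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        mutants.append((label, mutant))
    return mutants


if __name__ == "__main__":
    # Hypothetical four-residue peptide, not taken from SEQ ID NO:19.
    for label, mutant in alanine_scan("MKAL"):
        print(label, mutant)
```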

Substantial homology can be to the entire nucleic acid or amino acid sequence or to fragments of these sequences.

The invention thus also includes polypeptide fragments of the adenylate cyclase. Fragments can be derived from the amino acid sequence shown in SEQ ID NO:19. However, the invention also encompasses fragments of the variants of the adenylate cyclase as described herein.

The fragments to which the invention pertains, however, are not to be construed as encompassing fragments per se that may have been disclosed prior to the invention (although the methods herein can pertain to known fragments).

Accordingly, a fragment can comprise at least about 10, 15, 20, 25, 30, 35, 40, 45, 50 or more contiguous amino acids. Fragments can retain one or more of the biological activities of the protein, for example the ability to bind to or cyclize GTP or ATP. Fragments can also be used as an immunogen to generate adenylate cyclase antibodies.

Biologically active fragments (peptides which are, for example, 5, 7, 10, 12, 15, 20, 30, 35, 36, 37, 38, 39, 40, 50, 100 or more amino acids in length) can comprise a domain or motif, e.g., catalytic site, adenylate cyclase signature, and sites for glycosylation, protein kinase C phosphorylation, casein kinase II phosphorylation, tyrosine kinase phosphorylation, and N-myristoylation. Further possible fragments include the catalytic site, sites important for cellular and subcellular targeting, sites functional for interacting with components of other cGMP or cAMP-dependent signal transduction pathways, and regulatory sites.

Such domains or motifs can be identified by means of routine computerized homology searching procedures.

Fragments, for example, can extend in one or both directions from the functional site to encompass 5, 10, 15, 20, 30, 40, 50, or up to 100 amino acids. Further, fragments can include sub-fragments of the specific domains mentioned above, which sub-fragments retain the function of the domain from which they are derived.
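Purely as an illustration of extending a fragment in both directions from a functional site, the sketch below takes 1-based inclusive site coordinates and a flank length, clips the result to the ends of the sequence, and returns the fragment with its coordinates. The function name, the coordinate convention, and the example sequence are hypothetical and do not correspond to any site in SEQ ID NO:19.

```python
def fragment_around_site(sequence, site_start, site_end, flank):
    """Return (start, end, fragment) extending `flank` residues on each side.

    site_start and site_end are 1-based, inclusive positions of the site.
    The fragment is clipped where the flank would run past either terminus.
    """
    if not (1 <= site_start <= site_end <= len(sequence)):
        raise ValueError("site coordinates fall outside the sequence")
    if flank < 0:
        raise ValueError("flank must be non-negative")

    start = max(1, site_start - flank)
    end = min(len(sequence), site_end + flank)
    return start, end, sequence[start - 1:end]


if __name__ == "__main__":
    peptide = "MKTAYIAKQRQISFVKSHFSRQ"  # hypothetical 22-residue sequence
    print(fragment_around_site(peptide, 10, 12, 5))
```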

These regions can be identified by well-known methods involving computerized homology analysis.

The invention also provides fragments with immunogenic properties. These contain an epitope-bearing portion of the adenylate cyclase and variants. These epitope-bearing peptides are useful to raise antibodies that bind specifically to an adenylate cyclase polypeptide or region or fragment. These peptides can contain at least 10, at least 12, at least 14, or between about 15 and about 30 amino acids.

Non-limiting examples of antigenic polypeptides that can be used to generate antibodies include peptides derived from an extracellular site. Regions having a high antigenicity index are shown in FIG. 48. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular peptide regions.

The epitope-bearing adenylate cyclase polypeptides may be produced by any conventional means (Houghten, R. A. (1985) Proc. Natl. Acad. Sci. USA 82:5131-5135). Simultaneous multiple peptide synthesis is described in U.S. Pat. No. 4,631,211.

Fragments can be discrete (not fused to other amino acids or polypeptides) or can be within a larger polypeptide. Further, several fragments can be comprised within a single larger polypeptide. In one embodiment a fragment designed for expression in a host can have heterologous pre- and pro-polypeptide regions fused to the amino terminus of the adenylate cyclase fragment and an additional region fused to the carboxyl terminus of the fragment.

The invention thus provides chimeric or fusion proteins. These comprise an adenylate cyclase peptide sequence operatively linked to a heterologous peptide having an amino acid sequence not substantially homologous to the adenylate cyclase. “Operatively linked” indicates that the adenylate cyclase peptide and the heterologous peptide are fused in-frame. The heterologous peptide can be fused to the N-terminus or C-terminus of the adenylate cyclase or can be internally located.

In one embodiment the fusion protein does not affect adenylate cyclase function per se. For example, the fusion protein can be a GST-fusion protein in which the adenylate cyclase sequences are fused to the N- or C-terminus of the GST sequences. Other types of fusion proteins include, but are not limited to, enzymatic fusion proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL-4 fusions, poly-His fusions and Ig fusions. Such fusion proteins, particularly poly-His fusions, can facilitate the purification of recombinant adenylate cyclase. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a heterologous signal sequence. Therefore, in another embodiment, the fusion protein contains a heterologous signal sequence at its N-terminus.

EP-A-0 464 533 discloses fusion proteins comprising various portions of immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EP-A 0232 262). In drug discovery, for example, human proteins have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists (Bennett et al. (1995) J. Mol. Recog. 8:52-58 and Johanson et al. (1995) J. Biol. Chem. 270:9459-9471). Thus, this invention also encompasses soluble fusion proteins containing an adenylate cyclase polypeptide and various portions of the constant regions of heavy or light chains of immunoglobulins of various subclass (IgG, IgM, IgA, IgE). Preferred as immunoglobulin is the constant part of the heavy chain of human IgG, particularly IgG1, where fusion takes place at the hinge region. For some uses it is desirable to remove the Fc after the fusion protein has been used for its intended purpose, for example when the fusion protein is to be used as antigen for immunizations. In a particular embodiment, the Fc part can be removed in a simple way by a cleavage sequence, which is also incorporated and can be cleaved with factor Xa.

A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al. (1992) Current Protocols in Molecular Biology). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). An adenylate cyclase-encoding nucleic acid can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the adenylate cyclase.
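The requirement that the two coding sequences be joined in-frame can be illustrated with a short in-silico check. This is a sketch under stated assumptions: both inputs are taken to be complete coding sequences beginning at the first codon position, the upstream stop codon (if present) is removed so that translation reads through into the downstream partner, and no linker or restriction site is modeled. The function names and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into consecutive three-nucleotide codons."""
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def fuse_in_frame(upstream_cds, downstream_cds):
    """Join two coding sequences so the downstream one stays in frame."""
    up = upstream_cds.upper()
    down = downstream_cds.upper()

    for name, cds in (("upstream", up), ("downstream", down)):
        if len(cds) == 0 or len(cds) % 3 != 0:
            raise ValueError(f"{name} sequence length must be a non-zero multiple of 3")

    up_codons = split_codons(up)
    # Drop a terminal stop codon so translation continues into the partner.
    if up_codons[-1] in STOP_CODONS:
        up_codons = up_codons[:-1]
    # Any remaining stop codon would truncate the fusion protein.
    if any(codon in STOP_CODONS for codon in up_codons):
        raise ValueError("upstream sequence contains an internal in-frame stop codon")

    return "".join(up_codons) + down


if __name__ == "__main__":
    # Hypothetical toy sequences standing in for a fusion moiety and a target.
    print(fuse_in_frame("ATGGCCTAA", "ATGAAATGA"))
```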

Another form of fusion protein is one that directly affects adenylate cyclase functions. Accordingly, an adenylate cyclase polypeptide is encompassed by the present invention in which one or more of the adenylate cyclase domains (or parts thereof) has been replaced by homologous domains (or parts thereof) from another adenylate cyclase family. Accordingly, various permutations are possible. For example, the amino-terminal regulatory domain, or subregion thereof, can be replaced with the domain or subregion from another isoform or adenylate cyclase family. As a further example, the catalytic domain, or parts thereof, can be replaced; the carboxy-terminal domain or subregion can be replaced. Thus, chimeric adenylate cyclases can be formed in which one or more of the native domains or subregions has been replaced by another.

Additionally, chimeric adenylate cyclase proteins can be produced in which one or more functional sites is derived from a different isoform, or from another adenylate cyclase family. It is understood, however, that sites could be derived from adenylate cyclase families that occur in the mammalian genome but which have not yet been discovered or characterized. Such sites include but are not limited to a catalytic site, regulatory site, sites important for targeting to subcellular and cellular locations, sites functional for interaction with components of cyclic AMP- and cyclic GMP-dependent signal transduction pathways, phosphorylation sites, glycosylation sites, and other functional sites disclosed herein.

The isolated adenylate cyclase can be purified from cells that naturally express it, such as those shown in FIGS. 50 and 51 and/or specifically disclosed herein above, among others; or, especially, purified from cells that have been altered to express it (recombinant); or synthesized using known protein synthesis methods.

In one embodiment, the protein is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding the adenylate cyclase polypeptide is cloned into an expression vector, the expression vector introduced into a host cell and the protein expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques.

Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally-occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in polypeptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art.

Accordingly, the polypeptides also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence for purification of the mature polypeptide or a pro-protein sequence.

Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

Such modifications are well known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as Proteins: Structure and Molecular Properties, 2nd ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (1990) Meth. Enzymol. 182:626-646; and Rattan et al. (1992) Ann. N.Y. Acad. Sci. 663:48-62.

As is also well known, polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of post-translational events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translational natural processes and by synthetic methods.

Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. Blockage of the amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in naturally-occurring and synthetic polypeptides. For instance, the amino-terminal residue of polypeptides made in E. coli, prior to proteolytic processing, almost invariably will be N-formylmethionine.

The modifications can be a function of how the protein is made. For recombinant polypeptides, for example, the modifications will be determined by the host cell posttranslational modification capacity and the modification signals in the polypeptide amino acid sequence. Accordingly, when glycosylation is desired, a polypeptide should be expressed in a glycosylating host, generally a eukaryotic cell. Insect cells often carry out the same posttranslational glycosylations as mammalian cells and, for this reason, insect cell expression systems have been developed to efficiently express mammalian proteins having native patterns of glycosylation. Similar considerations apply to other modifications.

The same type of modification may be present in the same or varying degree at several sites in a given polypeptide. Also, a given polypeptide may contain more than one type of modification.

Methods Using the Antibodies

Methods for using antibodies as disclosed herein are particularly applicable to the cells, tissues and disorders shown in FIGS. 50 and 51 and as otherwise discussed herein above.

The invention provides methods using antibodies that selectively bind to the adenylate cyclase and its variants and fragments. An antibody is considered to selectively bind, even if it also binds to other proteins that are not substantially homologous with the adenylate cyclase. These other proteins share homology with a fragment or domain of the adenylate cyclase. This conservation in specific regions gives rise to antibodies that bind to both proteins by virtue of the homologous sequence. In this case, it would be understood that antibody binding to the adenylate cyclase is still selective.

The invention provides methods of using antibodies to isolate an adenylate cyclase by standard techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate the purification of the adenylate cyclase from cells naturally expressing it and cells recombinantly producing it.

The antibodies can be used to detect the presence of adenylate cyclase in cells or tissues to determine the pattern of expression of the adenylate cyclase among various tissues in an organism and over the course of normal development.

The antibodies can be used to detect adenylate cyclase in situ, in vitro, or in a cell lysate or supernatant in order to evaluate the abundance and pattern of expression.

The antibodies can be used to assess abnormal tissue distribution or abnormal expression during development.

Antibody detection of circulating fragments of the full-length adenylate cyclase can be used to identify adenylate cyclase turnover.

Further, the antibodies can be used to assess adenylate cyclase expression in disease states such as in active stages of the disease or in an individual with a predisposition toward disease related to adenylate cyclase function. When a disorder is caused by an inappropriate tissue distribution, developmental expression, or level of expression of the adenylate cyclase protein, the antibody can be prepared against the normal adenylate cyclase protein. If a disorder is characterized by a specific mutation in the adenylate cyclase, antibodies specific for this mutant protein can be used to assay for the presence of the specific mutant adenylate cyclase. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular adenylate cyclase peptide regions.

The antibodies can also be used to assess normal and aberrant subcellular localization in cells in the various tissues in an organism. Antibodies can be developed against the whole adenylate cyclase or portions of the adenylate cyclase.

The diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment is ultimately aimed at correcting adenylate cyclase expression level or the presence of aberrant adenylate cyclases and aberrant tissue distribution or developmental expression, antibodies directed against the adenylate cyclase or relevant fragments can be used to monitor therapeutic efficacy.

Antibodies accordingly can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen.

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared against polymorphic adenylate cyclase can be used to identify individuals that require modified treatment modalities.

Antibodies can also be used in diagnostic procedures as an immunological marker for aberrant adenylate cyclase analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and other physical assays known to those in the art.

The antibodies are also useful for tissue typing. Thus, where the adenylate cyclase is expressed in a specific tissue, antibodies that are specific for this adenylate cyclase can be used to identify the tissue type.

The antibodies are also useful in forensic identification. Accordingly, where an individual has been correlated with a specific genetic polymorphism resulting in a specific polymorphic protein, an antibody specific for the polymorphic protein can be used as an aid in identification.

The antibodies are also useful for inhibiting adenylate cyclase function, for example, blocking binding of GTP or ATP, G-protein, or the catalytic site.

These uses can also be applied in a therapeutic context in which treatment involves inhibiting adenylate cyclase function. An antibody can be used, for example, to block ATP or GTP binding. Antibodies can be prepared against specific fragments containing sites required for function or against intact adenylate cyclase.

Completely human antibodies are particularly desirable for therapeutic treatment of human patients. For an overview of this technology for producing human antibodies, see Lonberg et al. (1995) Int. Rev. Immunol. 13:65-93. For a detailed discussion of this technology for producing human antibodies and human monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Pat. No. 5,625,126; U.S. Pat. No. 5,633,425; U.S. Pat. No. 5,569,825; U.S. Pat. No. 5,661,016; and U.S. Pat. No. 5,545,806.

The invention also encompasses kits for using antibodies to detect the presence of an adenylate cyclase protein in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and a compound or agent for detecting adenylate cyclase in a biological sample; means for determining the amount of adenylate cyclase in the sample; and means for comparing the amount of adenylate cyclase in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect adenylate cyclase.

Antibodies

The methods for using antibodies described above are based on the generation of antibodies that specifically bind to the adenylate cyclase or its variants or fragments.

To generate antibodies, an isolated adenylate cyclase polypeptide is used as an immunogen to generate antibodies using standard techniques for polyclonal and monoclonal antibody preparation. Either the full-length protein or antigenic peptide fragment can be used. Regions having a high antigenicity index are shown in FIG. 48.

Antibodies are preferably prepared from these regions or from discrete fragments in these regions. However, antibodies can be prepared from any region of the peptide as described herein. A preferred fragment produces an antibody that diminishes or completely prevents G-protein, ATP, or GTP binding. Antibodies can be developed against the entire adenylate cyclase or domains of the adenylate cyclase as described herein. Antibodies can also be developed against specific functional sites as disclosed herein.

The antigenic peptide can comprise a contiguous sequence of at least 12, 14, 15, or 30 amino acid residues. In one embodiment, fragments correspond to regions that are located on the surface of the protein, e.g., hydrophilic regions. These fragments are not to be construed, however, as encompassing any fragments that may have been disclosed prior to the invention.
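By way of illustration only, candidate hydrophilic fragments of a given minimum length could be located with a sliding-window scan. The sketch below assumes the Kyte-Doolittle hydropathy scale as a stand-in for surface exposure; it is not the antigenicity index of FIG. 48, and the function name, the window length of 12, and the cutoff of -1.0 are hypothetical choices made for the example.

```python
# Kyte-Doolittle hydropathy values; more negative means more hydrophilic.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_windows(protein, window=12, max_mean=-1.0):
    """Return (start, fragment, mean hydropathy) for every contiguous
    window whose mean hydropathy is at or below max_mean.

    window and max_mean are illustrative assumptions, not values
    taken from the specification.
    """
    hits = []
    for start in range(len(protein) - window + 1):
        fragment = protein[start:start + window]
        mean = sum(KD[aa] for aa in fragment) / window
        if mean <= max_mean:
            hits.append((start, fragment, round(mean, 2)))
    return hits


if __name__ == "__main__":
    # Hypothetical toy sequence, not SEQ ID NO:19.
    for hit in hydrophilic_windows("AAAAAAAAAAAAKKKKKKKKKKKK"):
        print(hit)
```

Fragments flagged in this way would still need to be checked against the actual structure or an experimentally grounded antigenicity measure before use as immunogens.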

Antibodies can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

An appropriate immunogenic preparation can be derived from native, recombinantly expressed, or chemically synthesized peptides.

Methods Using the Polynucleotides

The nucleotide sequences of the present invention can be used as a “query sequence” to perform a search against public databases, for example, to identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (1990) J. Mol. Biol. 215:403-10. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

The methods and uses described herein below for the adenylate cyclase polynucleotide are particularly applicable to the cells, tissues, and disorders shown in FIGS. 50 and 51, and specifically discussed herein above.

The nucleic acid fragments useful to practice the invention provide probes or primers in assays, such as those described herein. “Probes” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid. Such probes include polypeptide nucleic acids, as described in Nielsen et al. (1991) Science 254:1497-1500. Typically, a probe comprises a region of nucleotide sequence that hybridizes under highly stringent conditions to at least about 15, typically about 20-25, and more typically about 40, 50 or 75 consecutive nucleotides of the nucleic acid sequence shown in SEQ ID NO:20 and the complements thereof. More typically, the probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis using well-known methods (e.g., PCR, LCR) including, but not limited to those described herein. The appropriate length of the primer depends on the particular use, but typically ranges from about 15 to 30 nucleotides. The term “primer site” refers to the area of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the nucleic acid sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the sequence to be amplified.
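The notion of a primer pair can be illustrated with a minimal sketch. It assumes the usual laboratory convention in which the upstream primer has the same sequence as the 5′ end of the sense strand and the downstream primer is the reverse complement of the 3′ end; the function names and the default length of 20 nucleotides are hypothetical, and practical primer design would also weigh melting temperature, GC content, and self-complementarity.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(target, length=20):
    """Return (upstream, downstream) primers flanking the target.

    length is an illustrative default within the roughly 15 to 30
    nucleotide range mentioned in the text.
    """
    if len(target) < 2 * length:
        raise ValueError("target is too short for two non-overlapping primers")
    upstream = target[:length]                         # same as the sense-strand 5' end
    downstream = reverse_complement(target[-length:])  # anneals to the sense-strand 3' end
    return upstream, downstream


if __name__ == "__main__":
    # Hypothetical toy target, not SEQ ID NO:20.
    target = "A" * 15 + "C" * 10 + "G" * 15
    print(primer_pair(target, length=15))
```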

The adenylate cyclase polynucleotides can be utilized as probes and primers in biological assays.

Where the polynucleotides are used to assess adenylate cyclase properties or functions, such as in the assays described herein, all or less than all of the entire cDNA can be useful. Assays specifically directed to adenylate cyclase functions, such as assessing agonist or antagonist activity, encompass the use of known fragments. Further, diagnostic methods for assessing adenylate cyclase function can also be practiced with any fragment, including those fragments that may have been known prior to the invention. Similarly, in methods involving treatment of adenylate cyclase dysfunction, all fragments are encompassed, including those that may have been known in the art.

The invention utilizes the adenylate cyclase polynucleotides as a hybridization probe for cDNA and genomic DNA to isolate a full-length cDNA and genomic clones encoding variant polypeptides and to isolate cDNA and genomic clones that correspond to variants producing the same polypeptides shown in SEQ ID NO:19 or the other variants described herein. Variants can be isolated from the same tissue and organism from which the polypeptide shown in SEQ ID NO:19 was isolated, different tissues from the same organism, or from different organisms. This method is useful for isolating variant genes and cDNA that are developmentally controlled and therefore may be expressed in the same tissue or different tissues at different points in the development of an organism. This method is useful for isolating variant genes and cDNA that are expressed in the cells, tissues, and disorders disclosed herein.

The probe can correspond to any sequence along the entire length of the gene encoding the adenylate cyclase. Accordingly, it could be derived from 5′ noncoding regions, the coding region, and 3′ noncoding regions.

The nucleic acid probe can be, for example, the full-length cDNA of SEQ ID NO:20, or a fragment thereof, such as an oligonucleotide of at least 12, 15, 30, 50, 100, 250 or 500 nucleotides in length and sufficient to specifically hybridize under stringent conditions to mRNA or DNA.

Fragments of the polynucleotides described herein can also be used to synthesize larger fragments or full-length polynucleotides described herein. For example, a fragment can be hybridized to any portion of an mRNA and a larger or full-length cDNA can be produced.

Fragments can also be used to synthesize antisense molecules of desired length and sequence.

Antisense nucleic acids, useful in treatment and diagnosis, can be designed using the nucleotide sequences of SEQ ID NO:20, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest).
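As a simple illustration of how an antisense sequence relates to its target, the following sketch derives an unmodified DNA oligonucleotide complementary to a chosen region of an mRNA or cDNA. The function name and the toy target are hypothetical; the sketch deals only with base pairing and does not model the modified nucleotides listed above or any selection of an accessible target region.

```python
# Watson-Crick partners used to build a DNA oligonucleotide against an
# RNA (A, U, G, C) or DNA (A, T, G, C) target.
_PAIR = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def antisense_oligo(target, start, length):
    """Return a DNA oligonucleotide, written 5' to 3', that is
    complementary to target[start:start + length]."""
    region = target.upper()[start:start + length]
    if len(region) < length:
        raise ValueError("requested region runs past the end of the target")
    # Reverse the region so the result reads 5' to 3', then pair each base.
    return "".join(_PAIR[base] for base in reversed(region))


if __name__ == "__main__":
    # Hypothetical toy mRNA, not SEQ ID NO:20.
    print(antisense_oligo("AUGGCCUUU", 0, 6))
```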

Additionally, the nucleic acid molecules useful to practice the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670. PNAs can be further modified, e.g., to enhance their stability, specificity or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra, Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63, Mag et al. (1989) Nucleic Acids Res. 17:5973, and Peterser et al. (1975) Bioorganic Med. Chem. Lett. 5:1119.

The nucleic acid molecules and fragments useful to practice the invention can also include other appended groups such as peptides (e.g., for targeting host cell adenylate cyclases in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/0918) or the blood brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm Res. 5:539-549).

The adenylate cyclase polynucleotides can also be used as primers for PCR to amplify any given region of an adenylate cyclase polynucleotide.

The adenylate cyclase polynucleotides can also be used to construct recombinant vectors. Such vectors include expression vectors that express a portion of, or all of, the adenylate cyclase polypeptides. Vectors also include insertion vectors, used to integrate into another polynucleotide sequence, such as into the cellular genome, to alter in situ expression of adenylate cyclase genes and gene products. For example, an endogenous adenylate cyclase coding sequence can be replaced via homologous recombination with all or part of the coding region containing one or more specifically introduced mutations.

The adenylate cyclase polynucleotides can also be used to express antigenic portions of the adenylate cyclase protein.

The adenylate cyclase polynucleotides can also be used as probes for determining the chromosomal positions of the adenylate cyclase polynucleotides by means of in situ hybridization methods, such as FISH (for a review of this technique, see Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York)), and PCR mapping of somatic cell hybrids. The mapping of the sequence to chromosomes is important in correlating these sequences with genes associated with disease, especially where translocations and/or amplification have occurred.

Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in V. McKusick, Mendelian Inheritance in Man, available on-line through Johns Hopkins University Welch Medical Library). The relationship between a gene and a disease mapped to the same chromosomal region can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland et al. (1987) Nature 325:783-787.

Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with a specified gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations, that are visible from chromosome spreads, or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

The adenylate cyclase polynucleotide probes can also be used to determine patterns of the presence of the gene encoding the adenylate cyclase with respect to tissue distribution, for example, whether gene duplication has occurred and whether the duplication occurs in all or only a subset of cells in a tissue. The genes can be naturally occurring or can have been introduced into a cell, tissue, or organism exogenously.

The adenylate cyclase polynucleotides can also be used to design ribozymes corresponding to all, or a part, of the mRNA produced from genes encoding the polynucleotides described herein, the ribozymes being useful to treat or diagnose a disorder or otherwise modulate expression of the nucleic acid.

The adenylate cyclase polynucleotides can also be used to make vectors that express part, or all, of the adenylate cyclase polypeptides.

The adenylate cyclase polynucleotides can also be used to construct host cells expressing a part, or all, of the adenylate cyclase polynucleotides and polypeptides.

The adenylate cyclase polynucleotides can also be used to construct transgenic animals expressing all, or a part, of the adenylate cyclase polynucleotides and polypeptides.

The adenylate cyclase polynucleotides can also be used as hybridization probes to determine the level of adenylate cyclase nucleic acid expression. Accordingly, the probes can be used to detect the presence of, or to determine levels of, adenylate cyclase nucleic acid in cells, tissues, and in organisms. DNA or RNA level can be determined. Probes can be used to assess gene copy number in a given cell, tissue, or organism. This is particularly relevant in cases in which there has been an amplification of the adenylate cyclase gene.

Alternatively, the probe can be used in an in situ hybridization context to assess the position of extra copies of the adenylate cyclase gene, as on extrachromosomal elements or as integrated into chromosomes in which the adenylate cyclase gene is not normally found, for example, as a homogeneously staining region.

These uses are relevant for diagnosis of disorders involving an increase or decrease in adenylate cyclase expression relative to normal, such as a proliferative disorder, a differentiative or developmental disorder, or a hematopoietic disorder, such as in the cells and tissues shown in FIGS. 50 and 51 and otherwise specifically discussed herein. Thus in one embodiment, disorders include diseases of the heart, such as myopathy and ischemia.

Thus, the present invention provides a method for identifying a disease or disorder associated with aberrant expression or activity of adenylate cyclase nucleic acid, in which a test sample is obtained from a subject and nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of the nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity of the nucleic acid.

One aspect of the invention relates to diagnostic assays for determining nucleic acid expression as well as activity in the context of a biological sample (e.g., blood, serum, cells, tissue) to determine whether an individual has a disease or disorder, or is at risk of developing a disease or disorder, associated with aberrant nucleic acid expression or activity. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with expression or activity of the nucleic acid molecules.

In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express the adenylate cyclase, such as by measuring the level of an adenylate cyclase-encoding nucleic acid in a sample of cells from a subject, e.g., mRNA or genomic DNA, or determining if the adenylate cyclase gene has been mutated.

Nucleic acid expression assays are useful for drug screening to identify compounds that modulate adenylate cyclase nucleic acid expression (e.g., antisense, polypeptides, peptidomimetics, small molecules or other drugs). A cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of the mRNA in the presence of the candidate compound is compared to the level of expression of the mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. The modulator can bind to the nucleic acid or indirectly modulate expression, such as by interacting with other cellular components that affect nucleic acid expression.

Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the gene to a subject) in patients or in transgenic animals.

The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with expression of the adenylate cyclase gene. The method typically includes assaying the ability of the compound to modulate the expression of the adenylate cyclase nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by excessive or deficient adenylate cyclase nucleic acid expression.

The assays can be performed in cell-based and cell-free systems, such as systems using the tissues described herein, in which the gene is expressed or in model systems for the disorders to which the invention pertains. Cell-based assays include cells naturally expressing the adenylate cyclase nucleic acid or recombinant cells genetically engineered to express specific nucleic acid sequences.

Alternatively, candidate compounds can be assayed in vivo in patients or in transgenic animals.

The assay for adenylate cyclase nucleic acid expression can involve direct assay of nucleic acid levels, such as mRNA levels, or assay of collateral compounds involved in the signal pathway (such as cAMP or cGMP turnover). Further, the expression of genes that are up- or down-regulated in response to the adenylate cyclase signal pathway can also be assayed. In this embodiment the regulatory regions of these genes can be operably linked to a reporter gene such as luciferase.

Thus, modulators of adenylate cyclase gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of adenylate cyclase mRNA in the presence of the candidate compound is compared to the level of expression of adenylate cyclase mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
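The comparison described above can be sketched as follows. The text does not name a statistical test, so the example assumes an exact two-sided permutation test on the difference in mean expression and a significance threshold of 0.05; the function names, the threshold, and the expression values are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, control):
    """Exact two-sided permutation test on the difference in means."""
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    n_control = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(control))
    total = sum(pooled)
    splits = extreme = 0
    # Enumerate every way of relabelling the pooled measurements.
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        splits += 1
        if diff >= observed - 1e-9:  # tolerance for floating-point rounding
            extreme += 1
    return extreme / splits


def classify_compound(treated, control, alpha=0.05):
    """Label a candidate compound from mRNA levels measured with the
    compound (treated) and without it (control)."""
    if permutation_p_value(treated, control) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(control) else "inhibitor"


if __name__ == "__main__":
    # Hypothetical expression levels in arbitrary units.
    with_compound = [10.1, 10.4, 9.9, 10.6]
    without_compound = [5.0, 5.2, 4.8, 5.1]
    print(classify_compound(with_compound, without_compound))
```

With an exact permutation test, at least four measurements per group are needed before a two-sided p-value below 0.05 is attainable, which is why the toy data use four replicates in each condition.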

Accordingly, the invention provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate adenylate cyclase nucleic acid expression. Modulation includes up-regulation (i.e., activation or agonization), down-regulation (suppression or antagonization), and effects on nucleic acid activity (e.g., when nucleic acid is mutated or improperly modified). Treatment is of disorders characterized by aberrant expression or activity of the nucleic acid.

The gene is particularly relevant for the treatment of disorders involving the tissues shown in FIGS. 50 and 51, particularly in prostate, skeletal muscle, brain, and testes, as well as tissues and cells involved in myopathy and ischemia.

Alternatively, a modulator for adenylate cyclase nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the adenylate cyclase nucleic acid expression.

The adenylate cyclase polynucleotides are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the adenylate cyclase gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

Monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of a specified mRNA or genomic DNA of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the mRNA or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the mRNA or genomic DNA in the pre-administration sample with the mRNA or genomic DNA in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.

The adenylate cyclase polynucleotides can be used in diagnostic assays for qualitative changes in adenylate cyclase nucleic acid, and particularly in qualitative changes that lead to pathology. The polynucleotides can be used to detect mutations in adenylate cyclase genes and gene expression products such as mRNA. The polynucleotides can be used as hybridization probes to detect naturally-occurring genetic mutations in the adenylate cyclase gene and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene; chromosomal rearrangement, such as inversion or transposition; modification of genomic DNA, such as aberrant methylation patterns; or changes in gene copy number, such as amplification. Detection of a mutated form of the adenylate cyclase gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of an adenylate cyclase.

Mutations in the adenylate cyclase gene can be detected at the nucleic acid level by a variety of techniques. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way.

In certain embodiments, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al. (1988) Science 241:1077-1080; and Nakazawa et al. (1994) PNAS 91:360-364), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al. (1995) Nucleic Acids Res. 23:675-682). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.
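
The size comparison described above can be illustrated computationally. The sketch below is a simplified in silico model, assuming exact primer matches on a single template strand; the primer and template sequences are hypothetical and are not taken from SEQ ID NO:20.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def amplicon_length(template, forward_primer, reverse_primer):
    """Length of the product bounded by two exactly matching primers, or None."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on the
    # given strand reads as its reverse complement.
    reverse_site = reverse_complement(reverse_primer)
    end = template.find(reverse_site, start + len(forward_primer))
    if end == -1:
        return None
    return end + len(reverse_site) - start


def classify_by_size(control_length, sample_length):
    """Compare a sample amplicon length against the control genotype."""
    if sample_length is None:
        return "no product"
    if sample_length < control_length:
        return "deletion"
    if sample_length > control_length:
        return "insertion"
    return "no size change"


# Hypothetical primers and templates for illustration only.
fwd, rev = "ATGGCC", "TTAGGC"
control = "GG" + "ATGGCC" + "TTTTAAAACCCC" + "GCCTAA" + "GG"
sample = "GG" + "ATGGCC" + "TTTTCCCC" + "GCCTAA" + "GG"  # four bases removed

control_len = amplicon_length(control, fwd, rev)
sample_len = amplicon_length(sample, fwd, rev)
print(control_len, sample_len, classify_by_size(control_len, sample_len))
```

A size comparison of this kind cannot detect point mutations, which leave the product length unchanged.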

It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self-sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well-known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.

Alternatively, mutations in an adenylate cyclase gene can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
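
As a hedged illustration of how a mutation alters a digestion pattern, the sketch below computes fragment lengths for a linear DNA molecule cut at a recognition site. EcoRI (recognition site GAATTC, cutting after the first G) is used as an assumed example enzyme; the sequences are hypothetical.

```python
def fragment_lengths(seq, site="GAATTC", cut_offset=1):
    """Fragment lengths after cutting a linear sequence at each site occurrence.

    cut_offset is the number of bases from the start of the site to the cut
    on the given strand (1 for EcoRI, which cuts G^AATTC).
    """
    seq = seq.upper()
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


# Hypothetical wild-type sequence with one EcoRI site, and a mutant in which
# a single substitution (A to G) abolishes the site.
wild_type = "AAAA" + "GAATTC" + "CCCCCCCC"
mutant = "AAAA" + "GAGTTC" + "CCCCCCCC"

print(fragment_lengths(wild_type))  # two fragments
print(fragment_lengths(mutant))     # one uncut fragment
```

On a gel, the wild-type digest would show two bands and the mutant a single larger band, which is the kind of pattern change the text refers to.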

Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.

Perfectly matched sequences can be distinguished from mismatched sequences by nuclease cleavage digestion assays or by differences in melting temperature.

Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method.

Furthermore, sequence differences between a mutant adenylate cyclase gene and a wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al. (1996) Adv. Chromatogr. 36:127-162; and Griffin et al. (1993) Appl. Biochem. Biotechnol. 38:147-159).

Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al. (1985) Science 230:1242; Cotton et al. (1988) PNAS 85:4397; Saleeba et al. (1992) Meth. Enzymol. 217:286-295), methods in which the electrophoretic mobility of mutant and wild-type nucleic acid is compared (Orita et al. (1989) PNAS 86:2766; Cotton et al. (1993) Mutat. Res. 285:125-144; and Hayashi et al. (1992) Genet. Anal. Tech. Appl. 9:73-79), and methods in which the movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al. (1985) Nature 313:495). The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In one embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet. 7:5). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

In other embodiments, genetic mutations can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al. (1996) Human Mutation 7:244-255; Kozal et al. (1996) Nature Medicine 2:753-759). For example, genetic mutations can be identified in two-dimensional arrays containing light-generated DNA probes as described in Cronin et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.

The adenylate cyclase polynucleotides can also be used for testing an individual for a genotype that, while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the polynucleotides can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). In the present case, for example, a mutation in the adenylate cyclase gene that results in altered affinity for ATP or GTP could result in an excessive or decreased drug effect with standard concentrations of ATP or GTP. Accordingly, the adenylate cyclase polynucleotides described herein can be used to assess the mutation content of the gene in an individual in order to select an appropriate compound or dosage regimen for treatment.

Thus, polynucleotides displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

The methods can involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting mRNA, or genomic DNA, such that the presence of mRNA or genomic DNA is detected in the biological sample, and comparing the presence of mRNA or genomic DNA in the control sample with the presence of mRNA or genomic DNA in the test sample.

The adenylate cyclase polynucleotides are also useful for chromosome identification when the sequence is identified with an individual chromosome and to a particular location on the chromosome. First, the DNA sequence is matched to the chromosome by in situ or other chromosome-specific hybridization. Sequences can also be correlated to specific chromosomes by preparing PCR primers that can be used for PCR screening of somatic cell hybrids containing individual chromosomes from the desired species. Only hybrids containing the chromosome containing the gene homologous to the primer will yield an amplified fragment. Sublocalization can be achieved using chromosomal fragments. Other strategies include prescreening with labeled flow-sorted chromosomes and preselection by hybridization to chromosome-specific libraries. Further mapping strategies include fluorescence in situ hybridization, which allows hybridization with probes shorter than those traditionally used. Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on the chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

The adenylate cyclase polynucleotides can also be used to identify individuals from small biological samples. This can be done, for example, using restriction fragment-length polymorphism (RFLP) to identify an individual. Thus, the polynucleotides described herein are useful as DNA markers for RFLP (See U.S. Pat. No. 5,272,057).

Furthermore, the adenylate cyclase sequence can be used to provide an alternative technique, which determines the actual DNA sequence of selected fragments in the genome of an individual. Thus, the adenylate cyclase sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify DNA from an individual for subsequent sequencing.

Panels of corresponding DNA sequences from individuals prepared in this manner can provide unique individual identifications, as each individual will have a unique set of such DNA sequences. It is estimated that allelic variation in humans occurs with a frequency of about once per 500 bases. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. The adenylate cyclase sequences can be used to obtain such identification sequences from individuals and from tissue. The sequences represent unique fragments of the human genome. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes.

If a panel of reagents from the sequences is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

The adenylate cyclase polynucleotides can also be used in forensic identification procedures. PCR technology can be used to amplify DNA sequences taken from very small biological samples, such as a single hair follicle or body fluids (e.g., blood, saliva, or semen). The amplified sequence can then be compared to a standard, allowing identification of the origin of the sample.

The adenylate cyclase polynucleotides can thus be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As described above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to the noncoding region are particularly useful since greater polymorphism occurs in the noncoding regions, making it easier to differentiate individuals using this technique.

The adenylate cyclase polynucleotides can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This is useful in cases in which a forensic pathologist is presented with a tissue of unknown origin. Panels of adenylate cyclase probes can be used to identify tissue by species and/or by organ type.

In a similar fashion, these primers and probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Alternatively, the adenylate cyclase polynucleotides can be used directly to block transcription or translation of adenylate cyclase gene sequences by means of antisense or ribozyme constructs. Thus, in a disorder characterized by abnormally high or undesirable adenylate cyclase gene expression, nucleic acids can be directly used for treatment.

The adenylate cyclase polynucleotides are thus useful as antisense constructs to control adenylate cyclase gene expression in cells, tissues, and organisms. A DNA antisense polynucleotide is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of adenylate cyclase protein. An antisense RNA or DNA polynucleotide would hybridize to the mRNA and thus block translation of mRNA into adenylate cyclase protein.

Examples of antisense molecules useful to inhibit nucleic acid expression include antisense molecules complementary to a fragment of the 5′ untranslated region of SEQ ID NO:20 which also includes the start codon and antisense molecules which are complementary to a fragment of the 3′ untranslated region of SEQ ID NO:20.
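
Deriving an antisense sequence that spans part of the 5′ untranslated region and the start codon can be sketched as follows. This is an illustration under stated assumptions: the input is a hypothetical cDNA string rather than SEQ ID NO:20, the first ATG is taken to be the start codon (which need not hold for a real transcript), and the window sizes are arbitrary.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def antisense_around_start(cdna, upstream=8, downstream=7):
    """Return (target, antisense) for a window spanning the first ATG.

    Assumes the first ATG in the sequence is the translation start codon.
    """
    seq = cdna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start == -1:
        raise ValueError("no ATG found in sequence")
    low = max(0, start - upstream)
    high = min(len(seq), start + 3 + downstream)
    target = seq[low:high]
    return target, reverse_complement(target)


# Hypothetical cDNA fragment: short 5' UTR followed by the start of a coding region.
example_cdna = "GCGGCCACCATGGCTAGCTTT"
target, antisense = antisense_around_start(example_cdna)
print("target   5'-" + target + "-3'")
print("antisense 5'-" + antisense + "-3'")
```

The antisense strand is written 5′ to 3′, so it pairs with the target when aligned in the antiparallel orientation.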

Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of adenylate cyclase nucleic acid. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired adenylate cyclase nucleic acid expression. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the adenylate cyclase protein.

The adenylate cyclase polynucleotides also provide vectors for gene therapy in patients containing cells that are aberrant in adenylate cyclase gene expression. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired adenylate cyclase protein to treat the individual.

The invention also encompasses kits for detecting the presence of an adenylate cyclase nucleic acid in a biological sample. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting adenylate cyclase nucleic acid in a biological sample; means for determining the amount of adenylate cyclase nucleic acid in the sample; and means for comparing the amount of adenylate cyclase nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect adenylate cyclase mRNA or DNA.

Polynucleotides

The nucleotide sequence in SEQ ID NO:20 was obtained by sequencing the deposited human cDNA. Accordingly, the sequence of the deposited clone is controlling as to any discrepancies between the two and any reference to the sequence of SEQ ID NO:20 includes reference to the sequence of the deposited cDNA.

The specifically disclosed cDNA comprises the coding region and 5′ and 3′ untranslated sequences in SEQ ID NO:20.

The invention provides isolated polynucleotides encoding the adenylate cyclase. The term “adenylate cyclase polynucleotide” or “adenylate cyclase nucleic acid” refers to the sequence shown in SEQ ID NO:20 or in the deposited cDNA. The term “adenylate cyclase polynucleotide” or “adenylate cyclase nucleic acid” further includes variants and fragments of the adenylate cyclase polynucleotides.

The methods and uses described herein can be based on the adenylate cyclase polynucleotide as a reagent or as a target.

The invention thus provides methods and uses for the nucleotide sequence in SEQ ID NO:20.

An “isolated” adenylate cyclase nucleic acid is one that is separated from other nucleic acid present in the natural source of the adenylate cyclase nucleic acid. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the adenylate cyclase nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for example up to about 5 KB. The important point is that the adenylate cyclase nucleic acid is isolated from flanking sequences such that it can be subjected to the specific manipulations described herein, such as recombinant expression, preparation of probes and primers, and other uses specific to the adenylate cyclase nucleic acid sequences.

Moreover, an “isolated” nucleic acid molecule, such as a cDNA or RNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.

In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present.

For example, recombinant DNA molecules contained in a vector are considered isolated. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

The adenylate cyclase polynucleotides can encode the mature protein plus additional amino or carboxyterminal amino acids, or amino acids interior to the mature polypeptide (when the mature form has more than one polypeptide chain, for instance). Such sequences may play a role in processing of a protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein half-life or facilitate manipulation of a protein for assay or production, among other things. As generally is the case in situ, the additional amino acids may be processed away from the mature protein by cellular enzymes.

The adenylate cyclase polynucleotides include, but are not limited to, the sequence encoding the mature polypeptide alone, the sequence encoding the mature polypeptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or pro-protein sequence), the sequence encoding the mature polypeptide, with or without the additional coding sequences, plus additional non-coding sequences, for example introns and non-coding 5′ and 3′ sequences such as transcribed but non-translated sequences that play a role in transcription, mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability of mRNA. In addition, the polynucleotide may be fused to a marker sequence encoding, for example, a peptide that facilitates purification.

Adenylate cyclase polynucleotides can be in the form of RNA, such as mRNA, or in the form of DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the non-coding strand (anti-sense strand).

In one embodiment, the adenylate cyclase nucleic acid comprises only the coding region.

The invention further provides variant adenylate cyclase polynucleotides, and fragments thereof, that differ from the nucleotide sequence shown in SEQ ID NO:20 due to degeneracy of the genetic code and thus encode the same protein as that encoded by the nucleotide sequence shown in SEQ ID NO:20.
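
Degeneracy of the genetic code can be demonstrated with a short translation example. The sketch below uses the standard genetic code; the two short DNA sequences are hypothetical and are chosen only to show that sequences differing at synonymous positions encode the same peptide.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding sequence in frame, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Two hypothetical coding sequences that differ at three nucleotide positions.
reference = "ATGGCTAAATAA"
variant = "ATGGCCAAGTGA"

print(translate(reference), translate(variant))
print(reference != variant and translate(reference) == translate(variant))
```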

The invention also provides adenylate cyclase nucleic acid molecules encoding the variant polypeptides described herein. Such polynucleotides may be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or may be constructed by recombinant DNA methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. Accordingly, as discussed above, the variants can contain nucleotide substitutions, deletions, inversions and insertions.

Typically, variants have a substantial identity with a nucleic acid molecule of SEQ ID NO:20 and the complements thereof. Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions.

Orthologs, homologs, and allelic variants can be identified using methods well known in the art. These variants comprise a nucleotide sequence encoding an adenylate cyclase that is at least about 60-65%, 65-70%, typically at least about 70-75%, more typically at least about 80-85%, and most typically at least about 90-95% or more homologous to the nucleotide sequence shown in SEQ ID NO:20 or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions, to the nucleotide sequence shown in SEQ ID NO:20 or a fragment of the sequence. It is understood that stringent hybridization does not indicate substantial homology where it is due to general homology, such as poly A sequences, or sequences common to all or most proteins, or all adenylate cyclases. Moreover, variants per se do not include any nucleic acid (or amino acid) sequence disclosed prior to the present invention, although the methods herein can encompass such variants.
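
One simple way to express the percentages above is percent identity over an existing alignment. The sketch below is a minimal example that assumes the two sequences have already been aligned to equal length, with "-" marking gaps; it does not perform the alignment itself, and the convention of counting gapped columns as mismatches is an assumption of the example.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same base.

    Columns where both sequences have a gap are ignored; a column with a gap
    in only one sequence counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment has no comparable columns")
    return 100.0 * matches / columns


# Hypothetical pre-aligned sequences differing at three of twelve positions.
identity = percent_identity("ATGGCTAAATAA", "ATGGCCAAGTGA")
print(f"{identity:.1f}% identical")
```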

As used herein, the term “hybridizes under stringent conditions” is intended to describe conditions for hybridization and washing under which nucleotide sequences encoding a polypeptide at least about 60-65% homologous to each other typically remain hybridized to each other. The conditions can be such that sequences at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 90%, at least about 95% or more identical to each other remain hybridized to one another. Such stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6, incorporated by reference. One example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50-65° C. In another non-limiting example, nucleic acid molecules are allowed to hybridize in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more low stringency washes in 0.2×SSC/0.1% SDS at room temperature, or by one or more moderate stringency washes in 0.2×SSC/0.1% SDS at 42° C., or washed in 0.2×SSC/0.1% SDS at 65° C. for high stringency. In one embodiment, an isolated nucleic acid molecule that hybridizes under stringent conditions to the sequence of SEQ ID NO:19 corresponds to a naturally-occurring nucleic acid molecule. As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

As understood by those of ordinary skill, the exact conditions can be determined empirically and depend on ionic strength, temperature and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS. Other factors considered in determining the desired hybridization conditions include the length of the nucleic acid sequences, base composition, percent mismatch between the hybridizing sequences and the frequency of occurrence of subsets of the sequences within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules.

The present invention also provides isolated nucleic acids that contain a single or double stranded fragment or portion that hybridizes under stringent conditions to the nucleotide sequence of SEQ ID NO:20 or the complement of SEQ ID NO:20. In one embodiment, the nucleic acid consists of a portion of the nucleotide sequence of SEQ ID NO:20 and the complement of SEQ ID NO:20. The nucleic acid fragments of the invention are at least about 15, preferably at least about 18, 20, 23 or 25 nucleotides, and can be 30, 40, 50, 100, 200, 500 or more nucleotides in length. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic proteins or polypeptides described herein are useful.

Furthermore, the invention provides polynucleotides that comprise a fragment of the full-length adenylate cyclase polynucleotide. The fragment can be single or double-stranded and can comprise DNA or RNA. The fragment can be derived from either the coding or the non-coding sequence.

In another embodiment an isolated adenylate cyclase nucleic acid encodes the entire coding region. In another embodiment the isolated adenylate cyclase nucleic acid encodes a sequence corresponding to the mature protein that may be from about amino acid 6 to the last amino acid. Other fragments include nucleotide sequences encoding the amino acid fragments described herein.

Thus, adenylate cyclase nucleic acid fragments further include sequences corresponding to the domains described herein, subregions also described, and specific functional sites. Adenylate cyclase nucleic acid fragments also include combinations of the domains, segments, and other functional sites described above. A person of ordinary skill in the art would be aware of the many permutations that are possible.

Where the location of the domains or sites has been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains can vary depending on the criteria used to define the domains.

However, it is understood that an adenylate cyclase fragment includes any nucleic acid sequence that does not include the entire gene.

The invention also provides adenylate cyclase nucleic acid fragments that encode epitope bearing regions of the adenylate cyclase proteins described herein.

Computer Readable Means

The nucleotide or amino acid sequences of the invention are also provided in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a nucleotide or amino acid sequence of the present invention. Such a manufacture provides the nucleotide or amino acid sequences, or a subset thereof (e.g., a subset of open reading frames (ORFs)) in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form.

In one application of this embodiment, a nucleotide or amino acid sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. The skilled artisan will readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. The skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide or amino acid sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially-available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.
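By way of illustration only, the two storage forms named above (an ASCII text file and a database application) might be sketched as follows. The FASTA-style layout, the SQLite database, the table name `sequences`, and the function names are assumptions chosen for this sketch; they are not formats or programs specified herein.

```python
import sqlite3


def write_fasta(path, records):
    """Write {identifier: sequence} to a plain ASCII, FASTA-style text file."""
    with open(path, "w", encoding="ascii") as fh:
        for name, seq in records.items():
            fh.write(f">{name}\n")
            # Wrap sequence lines at 60 characters (a common convention).
            for i in range(0, len(seq), 60):
                fh.write(seq[i:i + 60] + "\n")


def read_fasta(path):
    """Read a FASTA-style file back into {identifier: sequence}."""
    records, name = {}, None
    with open(path, encoding="ascii") as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                name = line[1:]
                records[name] = ""
            elif name is not None:
                records[name] += line
    return records


def store_in_database(conn, records):
    """Record the same sequences in a relational table (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sequences (name TEXT PRIMARY KEY, seq TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sequences VALUES (?, ?)", list(records.items())
    )
    conn.commit()
```

Either form preserves the same information; the choice would depend, as stated above, on the means chosen to access the stored sequences.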

By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif.
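A minimal sketch of such a search means, assuming exact string matching against sequences held in memory, is given below. The function name `find_target` and the dictionary layout are hypothetical; practical search software would typically allow inexact matches.

```python
def find_target(records, target):
    """Return (identifier, 0-based position) for every exact occurrence of
    `target` in the stored sequences, including overlapping occurrences."""
    hits = []
    target = target.upper()
    for name, seq in records.items():
        seq = seq.upper()
        pos = seq.find(target)
        while pos != -1:
            hits.append((name, pos))
            # Advance by one residue so overlapping matches are also reported.
            pos = seq.find(target, pos + 1)
    return hits
```

For example, calling `find_target({"a": "ATGATGA"}, "ATGA")` reports matches at positions 0 and 3 of sequence `a`.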

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
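The relationship between target length and random occurrence can be illustrated numerically. The sketch below assumes a simplified model in which every residue is independent and equally likely (four nucleotides or twenty amino acids); real databases deviate from this model, so the figures are indicative only. The function name is hypothetical.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4):
    """Expected number of chance exact matches of a target of length
    `target_len` in a random sequence of length `db_len`, assuming
    independent, equiprobable residues (a simplifying assumption)."""
    if db_len < target_len:
        return 0.0
    positions = db_len - target_len + 1          # possible start positions
    return positions * alphabet_size ** (-target_len)
```

Under this model a six-nucleotide target is expected about once in every 4,096 positions, whereas a thirty-nucleotide target is expected roughly once in 10^18 positions, which is why longer targets are rarely matched by chance.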

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

For example, software which implements the BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al. (1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system can be used to identify open reading frames (ORFs) of the sequences of the invention which contain homology to ORFs or proteins from other libraries. Such ORFs are protein encoding fragments and are useful in producing commercially important proteins such as enzymes used in various reactions and in the production of commercially useful metabolites.
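As a simple illustration of how ORFs might be located before any homology search, the following sketch scans the three forward reading frames for an ATG start codon followed in frame by a stop codon. It is not the BLAST or BLAZE algorithm; the function name and the `min_codons` threshold are assumptions made for this example, and the reverse strand is not examined.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return (start, end, orf_sequence) for each ATG...stop ORF found in the
    three forward frames; `end` is exclusive and includes the stop codon.
    `min_codons` counts codons before the stop codon."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                      # first in-frame start codon
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:i + 3]))
                start = None                   # resume scanning after the stop
    return orfs
```

The ORFs returned by such a scan could then be translated and submitted to a homology search of the kind described above.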

Methods Using Vectors and Host Cells

The methods using vectors and host cells are particularly relevant where vectors are expressed in the cells, tissues, and disorders shown in FIGS. 50 and 51, and otherwise discussed herein, or where the host cells are those that naturally express the gene, as shown in these figures and which may be the native or a recombinant cell expressing the gene.

It is understood that “host cells” and “recombinant host cells” refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein.

The host cells expressing the polypeptides described herein, and particularly recombinant host cells, have a variety of uses. First, the cells are useful for producing adenylate cyclase proteins or polypeptides that can be further purified to produce desired amounts of adenylate cyclase protein or fragments. Thus, host cells containing expression vectors are useful for polypeptide production, as are cells producing significant amounts of the polypeptide, for example, the high-expressers shown in FIG. 51, namely testes, prostate, skeletal muscle and brain.

Host cells are also useful for conducting cell-based assays involving the adenylate cyclase or adenylate cyclase fragments. Thus, a recombinant host cell expressing a native adenylate cyclase is useful to assay for compounds that stimulate or inhibit adenylate cyclase function. This includes ATP or GTP binding, gene expression at the level of transcription or translation, G-protein interaction, and components of the signal transduction pathway.

Host cells are also useful for identifying adenylate cyclase mutants in which these functions are affected. If the mutants naturally occur and give rise to a pathology, host cells containing the mutations are useful to assay compounds that have a desired effect on the mutant adenylate cyclase (for example, stimulating or inhibiting function) which may not be indicated by their effect on the native adenylate cyclase.

Recombinant host cells are also useful for expressing the chimeric polypeptides described herein to assess compounds that activate or suppress activation by means of a heterologous domain, segment, site, and the like, as disclosed herein.

Further, mutant adenylate cyclases can be designed in which one or more of the various functions is engineered to be increased or decreased (e.g., ATP binding or G-protein binding) and used to augment or replace adenylate cyclase proteins in an individual. Thus, host cells can provide a therapeutic benefit by replacing an aberrant adenylate cyclase or providing an aberrant adenylate cyclase that provides a therapeutic result. In one embodiment, the cells provide adenylate cyclases that are abnormally active.

In another embodiment, the cells provide an adenylate cyclase that is abnormally inactive. This adenylate cyclase can compete with endogenous adenylate cyclase in the individual.

In another embodiment, cells expressing adenylate cyclases that cannot be activated are introduced into an individual in order to compete with endogenous adenylate cyclase for ATP. For example, in the case in which excessive ATP is part of a treatment modality, it may be necessary to inactivate this molecule at a specific point in treatment. Providing cells that compete for the molecule, but which cannot be affected by adenylate cyclase activation would be beneficial.

Homologously recombinant host cells can also be produced that allow the in situ alteration of endogenous adenylate cyclase polynucleotide sequences in a host cell genome. The host cell includes, but is not limited to, a stable cell line, cell in vivo, or cloned microorganism. This technology is more fully described in WO 93/09222, WO 91/12650, WO 91/06667, U.S. Pat. No. 5,272,071, and U.S. Pat. No. 5,641,670. Briefly, specific polynucleotide sequences corresponding to the adenylate cyclase polynucleotides or sequences proximal or distal to an adenylate cyclase gene are allowed to integrate into a host cell genome by homologous recombination where expression of the gene can be affected. In one embodiment, regulatory sequences are introduced that either increase or decrease expression of an endogenous sequence. Accordingly, an adenylate cyclase protein can be produced in a cell not normally producing it. Alternatively, increased expression of adenylate cyclase protein can be effected in a cell normally producing the protein at a specific level. Further, expression can be decreased or eliminated by introducing a specific regulatory sequence. The regulatory sequence can be heterologous to the adenylate cyclase protein sequence or can be a homologous sequence with a desired mutation that affects expression. Alternatively, the entire gene can be deleted. The regulatory sequence can be specific to the host cell or capable of functioning in more than one cell type. Still further, specific mutations can be introduced into any desired region of the gene to produce mutant adenylate cyclase proteins. Such mutations could be introduced, for example, into the specific functional regions such as the nucleotide triphosphate site.

In one embodiment, the host cell can be a fertilized oocyte or embryonic stem cell that can be used to produce a transgenic animal containing the altered adenylate cyclase gene. Alternatively, the host cell can be a stem cell or other early tissue precursor that gives rise to a specific subset of cells and can be used to produce transgenic tissues in an animal. See also Thomas et al., Cell 51:503 (1987) for a description of homologous recombination vectors. The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced gene has homologously recombined with the endogenous adenylate cyclase gene are selected (see e.g., Li, E. et al. (1992) Cell 69:915). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see e.g., Bradley, A. in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT International Publication Nos. WO 90/11354; WO 91/01140; and WO 93/04169.

The genetically engineered host cells can be used to produce non-human transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse, in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal in one or more cell types or tissues of the transgenic animal. These animals are useful for studying the function of an adenylate cyclase protein and identifying and evaluating modulators of adenylate cyclase protein activity.

Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians.

In one embodiment, a host cell is a fertilized oocyte or an embryonic stem cell into which adenylate cyclase polynucleotide sequences have been introduced.

A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. Any of the adenylate cyclase nucleotide sequences can be introduced as a transgene into the genome of a non-human animal, such as a mouse.

Any of the regulatory or other sequences useful in expression vectors can form part of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already included. One or more tissue-specific regulatory sequences can be operably linked to the transgene to direct expression of the adenylate cyclase protein to particular cells.

Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to other transgenic animals carrying other transgenes. A transgenic animal also includes animals in which the entire animal or tissues in the animal have been produced using the homologously recombinant host cells described herein.

In another embodiment, transgenic non-human animals can be produced which contain selected systems, which allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. (1992) PNAS 89:6232-6236. Another example of a recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut et al. (1997) Nature 385:810-813 and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle and enter G₀ phase. The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst and then transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

Transgenic animals containing recombinant cells that express the polypeptides described herein are useful to conduct the assays described herein in an in vivo context. The various physiological factors that are present in vivo and that could affect cAMP binding, adenylate cyclase activation, and signal transduction may not be evident from in vitro cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to assay in vivo adenylate cyclase function, including ATP interaction, the effect of specific mutant adenylate cyclases on adenylate cyclase function and ATP interaction, and the effect of chimeric adenylate cyclases. It is also possible to assess the effect of null mutations, that is, mutations that substantially or completely eliminate one or more adenylate cyclase functions.

In general, methods for producing transgenic animals include introducing a nucleic acid sequence according to the present invention, the nucleic acid sequence capable of expressing the protein in a transgenic animal, into a cell in culture or in vivo. When introduced in vivo, the nucleic acid is introduced into an intact organism such that one or more cell types and, accordingly, one or more tissue types, express the nucleic acid encoding the protein. Alternatively, the nucleic acid can be introduced into virtually all cells in an organism by transfecting a cell in culture, such as an embryonic stem cell, as described herein for the production of transgenic animals, and this cell can be used to produce an entire transgenic organism. As described, in a further embodiment, the host cell can be a fertilized oocyte. Such cells are then allowed to develop in a female foster animal to produce the transgenic organism.

Vectors/Host Cells

The methods using the vectors and host cells discussed above are based on the vectors and host cells including, but not limited to, those described below.

The invention also provides methods using vectors containing the adenylate cyclase polynucleotides. The term “vector” refers to a vehicle, preferably a nucleic acid molecule that can transport the adenylate cyclase polynucleotides. When the vector is a nucleic acid molecule, the adenylate cyclase polynucleotides are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the adenylate cyclase polynucleotides. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the adenylate cyclase polynucleotides when the host cell replicates.

The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the adenylate cyclase polynucleotides. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the adenylate cyclase polynucleotides such that transcription of the polynucleotides is allowed in a host cell. The polynucleotides can be introduced into the host cell with a separate polynucleotide capable of affecting transcription. Thus, the second polynucleotide may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the adenylate cyclase polynucleotides from the vector. Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself.

It is understood, however, that in some embodiments, transcription and/or translation of the adenylate cyclase polynucleotides can occur in a cell-free system.

The regulatory sequences to which the polynucleotides described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

A variety of expression vectors can be used to express an adenylate cyclase polynucleotide. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g., cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd. ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The regulatory sequence may provide constitutive expression in one or more host cells (i.e., tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

The adenylate cyclase polynucleotides can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.
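The cleave-and-ligate step can be modeled in silico as a string operation. The sketch below assumes a single enzyme, EcoRI (recognition site GAATTC, cut after the first G), a vector with exactly one site, and a donor sequence in which the insert is flanked by two sites. It tracks only the top strand and ignores insert orientation and overhang chemistry; the function names and example sequences are hypothetical.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Split `seq` at every occurrence of `site`, cutting `cut_offset`
    bases into the site (EcoRI: G^AATTC). Top strand only."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def clone_insert(vector, donor, site="GAATTC", cut_offset=1):
    """Excise the fragment between two sites in `donor` and ligate it into
    the single site of `vector` (linear top-strand model)."""
    vector_parts = digest(vector, site, cut_offset)
    donor_parts = digest(donor, site, cut_offset)
    if len(vector_parts) != 2 or len(donor_parts) != 3:
        raise ValueError("expected one site in vector and two in donor")
    left, right = vector_parts
    insert = donor_parts[1]
    return left + insert + right
```

In this model the recognition site is regenerated on both sides of the insert, which mirrors what is expected when compatible cohesive ends produced by the same enzyme are ligated.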

The vector containing the appropriate polynucleotide can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

As described herein, it may be desirable to express the polypeptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the adenylate cyclase polypeptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting for example as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired polypeptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al. (1988) Gene 67:31-40), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al. (1988) Gene 69:301-315) and pET 11d (Studier et al. (1990) Gene Expression Technology: Methods in Enzymology 185:60-89).

Recombinant protein expression can be maximized in a host bacterium by providing a genetic background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S. (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. 119-128). Alternatively, the sequence of the polynucleotide of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli (Wada et al. (1992) Nucleic Acids Res. 20:2111-2118).
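Altering a sequence toward preferential codon usage can be sketched as a synonymous substitution that leaves the encoded polypeptide unchanged. In the example below, the standard genetic code is built from its conventional compact representation, and the caller supplies the table of preferred codons. The function names and the two-entry table in the usage note are illustrative assumptions, not a usage table taken from Wada et al.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(cds):
    """Translate a coding sequence (length divisible by 3) to one-letter
    amino acids; '*' marks a stop codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, preferred):
    """Replace each codon with the host-preferred synonymous codon given in
    `preferred` ({amino_acid: codon}); other codons are left unchanged."""
    cds = cds.upper()
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        aa = CODON_TABLE[codon]
        choice = preferred.get(aa, codon)
        if CODON_TABLE[choice] != aa:
            raise ValueError(f"{choice} does not encode {aa}")
        out.append(choice)
    return "".join(out)
```

For instance, with the illustrative table `{"L": "CTG", "K": "AAA"}`, the sequence `ATGTTAAAGTAA` is recoded to `ATGCTGAAATAA`, and both translate to the same polypeptide.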

The adenylate cyclase polynucleotides can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast (e.g., S. cerevisiae) include pYepSec1 (Baldari et al. (1987) EMBO J. 6:229-234), pMFa (Kurjan et al. (1982) Cell 30:933-943), pJRY88 (Schultz et al. (1987) Gene 54:113-123), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

The adenylate cyclase polynucleotides can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al. (1983) Mol. Cell. Biol. 3:2156-2165) and the pVL series (Luckow et al. (1989) Virology 170:31-39).

In certain embodiments of the invention, the polynucleotides described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. (1987) Nature 329:840) and pMT2PC (Kaufman et al. (1987) EMBO J. 6:187-195).

The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the adenylate cyclase polynucleotides. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation, or expression of the polynucleotides described herein. These are found for example in Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.

The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the polynucleotide sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).
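The sequence of an antisense transcript follows directly from the sense (coding-strand) sequence: it is the reverse complement, written as RNA. A minimal sketch is given below; the function names are hypothetical, and only unambiguous A, C, G, and T bases are handled.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """Return the antisense RNA, 5' to 3', for a sense-strand DNA sequence.
    This is the transcript expected when the insert is cloned in reverse
    orientation relative to the promoter."""
    return reverse_complement(sense_dna.upper()).replace("T", "U")
```

For the sense sequence `ATGAAA` (mRNA `AUGAAA`), the antisense transcript is `UUUCAU`, which base pairs with the sense mRNA in antiparallel orientation.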

The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, lipofection, and other techniques such as those found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2d ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

Host cells can contain more than one vector. Thus, different nucleotide sequences can be introduced on different vectors of the same cell. Similarly, the adenylate cyclase polynucleotides can be introduced either alone or with other polynucleotides that are not related to the adenylate cyclase polynucleotides such as those providing trans-acting factors for expression vectors. When more than one vector is introduced into a cell, the vectors can be introduced independently, co-introduced or joined to the adenylate cyclase polynucleotide vector.

In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be replication-competent or replication-defective. In the case in which viral replication is defective, replication will occur in host cells providing functions that complement the defects.

Vectors generally include selectable markers that enable the selection of the subpopulation of cells that contain the recombinant vector constructs. The marker can be contained in the same vector that contains the polynucleotides described herein or may be on a separate vector. Markers include tetracycline or ampicillin-resistance genes for prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that provides selection for a phenotypic trait will be effective.

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other cells under the control of the appropriate regulatory sequences, cell-free transcription and translation systems can also be used to produce these proteins using RNA derived from the DNA constructs described herein.

Where secretion of the polypeptide is desired, appropriate secretion signals are incorporated into the vector. The signal sequence can be endogenous to the adenylate cyclase polypeptides or heterologous to these polypeptides.

Where the polypeptide is not secreted into the medium, the protein can be isolated from the host cell by standard disruption procedures, including freeze thaw, sonication, mechanical disruption, use of lysing agents and the like. The polypeptide can then be recovered and purified by well-known purification methods including ammonium sulfate precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, or high performance liquid chromatography.

It is also understood that depending upon the host cell in recombinant production of the polypeptides described herein, the polypeptides can have various glycosylation patterns, depending upon the cell, or may be non-glycosylated as when produced in bacteria. In addition, the polypeptides may include an initial modified methionine in some cases as a result of a host-mediated process.

Pharmaceutical Compositions

The invention encompasses use of the polypeptides, nucleic acids, and other agents in pharmaceutical compositions to administer to the cells in which expression of the adenylate cyclase is relevant and in disorders as disclosed herein. Uses are both diagnostic and therapeutic. The adenylate cyclase nucleic acid molecules, protein, modulators of the protein, and antibodies (also referred to herein as “active compounds”) can be incorporated into pharmaceutical compositions suitable for administration to a subject, e.g., a human. Such compositions typically comprise the nucleic acid molecule, protein, modulator, or antibody and a pharmaceutically acceptable carrier. It is understood, however, that administration can also be to cells in vitro as well as to in vivo model systems such as non-human transgenic animals.

The term “administer” is used in its broadest sense and includes any method of introducing the compositions of the present invention into a subject. This includes producing polypeptides or polynucleotides in vivo as by transcription or translation, in vivo, of polynucleotides that have been exogenously introduced into a subject. Thus, polypeptides or nucleic acids produced in the subject from the exogenous compositions are encompassed in the term “administer.”

As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, such media can be used in the compositions of the invention. Supplementary active compounds can also be incorporated into the compositions. A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampules, disposable syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersions. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion, and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents in the composition, for example, sugars, polyalcohols such as mannitol and sorbitol, or sodium chloride. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., an adenylate cyclase protein or anti-adenylate cyclase antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying, which yield a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For oral administration, the agent can be contained in enteric forms to survive the stomach or further coated or mixed to be released in a particular region of the GI tract by known methods. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose; a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or from a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. “Dosage unit form” as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated, each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specification for the dosage unit forms of the invention is dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al. (1994) PNAS 91:3054-3057). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g. retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight.
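
The per-kilogram ranges above translate into total amounts by simple multiplication with body weight. The following is a minimal arithmetic sketch only, not a dosing method from this disclosure; the function name `dose_range_mg` and the 70 kg example body weight are assumptions introduced for illustration.

```python
# Illustrative arithmetic only: converts a mg/kg range into a total mg range.
# The function name and the 70 kg body weight are hypothetical, not from the text.

def dose_range_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return (low, high) total amount in mg for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("low bound must not exceed high bound")
    return (body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg)

# Ranges quoted in the text (mg/kg body weight), broadest first.
RANGES_MG_PER_KG = [(0.001, 30), (0.01, 25), (0.1, 20), (1, 10)]

if __name__ == "__main__":
    weight_kg = 70  # assumed example value
    for low, high in RANGES_MG_PER_KG:
        total_low, total_high = dose_range_mg(weight_kg, low, high)
        print(f"{low} to {high} mg/kg -> {total_low:g} to {total_high:g} mg")
```

Under the assumed 70 kg body weight, the "about 1 to 10 mg/kg" range corresponds to about 70 to 700 mg in total.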

The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments. In a preferred example, a subject is treated with antibody, protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays as described herein.

The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

It is understood that appropriate doses of small molecule agents depend upon a number of factors within the ken of the ordinarily skilled physician, veterinarian, or researcher. The dose(s) of the small molecule will vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be determined using the assays described herein. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

This invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will fully convey the invention to those skilled in the art. Many modifications and other embodiments of the invention will come to mind to one skilled in the art to which this invention pertains having the benefit of the teachings presented in the foregoing description. Although specific terms are employed, they are used as in the art unless otherwise indicated.

CHAPTER 6

Novel Human GTPase Activator Proteins

BACKGROUND OF THE INVENTION

The Ras Superfamily of GTPases

Proteins regulating Ras and its relatives have been reviewed in Boguski et al. (Nature 366:643-654 (1993)), summarized below. Ras proteins and their relatives are key in the control of normal and transformed cell growth. Small GTPases related to Ras control a wide variety of cellular processes which include aspects of growth and differentiation, control of the cytoskeleton and regulation of cellular traffic between membrane bound compartments. These proteins cycle between active and inactive states bound to GTP and GDP, respectively. This cycling is influenced by three classes of proteins that switch the GTPase on, switch it off, and prevent it from switching. Further, the intracellular location of the GTPase can be controlled by another class of regulatory protein. The GTP-bound form of the GTPase is converted to the GDP-bound form by an intrinsic capacity to hydrolyze GTP. This process is accelerated by a GTPase-activating protein (GAP). Activation involves the replacement of GDP with GTP. This event is mediated by proteins designated guanine nucleotide exchange factors (GEF) or guanine nucleotide releasing protein (GNRP) and guanine nucleotide dissociation stimulator (GDS). The process is inhibited by guanine nucleotide dissociation inhibitors (GDI). Further, membrane anchoring of the GTPase is critical for proper function and is regulated, among other enzymes, by prenyltransferases.
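
The GDP/GTP cycle described above can be pictured as a two-state switch acted on by three regulator classes. The following is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the GDI is modeled only as blocking GEF-mediated exchange, which is a deliberate simplification of the biology summarized in this section.

```python
from enum import Enum


class NucleotideState(Enum):
    GDP_BOUND = "inactive (GDP-bound)"
    GTP_BOUND = "active (GTP-bound)"


class GTPase:
    """Toy model of a Ras-like GTPase; all names here are illustrative."""

    def __init__(self, name):
        self.name = name
        self.state = NucleotideState.GDP_BOUND  # start in the inactive form
        self.gdi_bound = False

    def bind_gdi(self):
        """A GDI binds and inhibits GDP dissociation (simplified)."""
        self.gdi_bound = True

    def release_gdi(self):
        self.gdi_bound = False

    def apply_gef(self):
        """A GEF replaces GDP with GTP (switch on). Returns True if switched."""
        if self.gdi_bound or self.state is not NucleotideState.GDP_BOUND:
            return False
        self.state = NucleotideState.GTP_BOUND
        return True

    def apply_gap(self):
        """A GAP accelerates GTP hydrolysis (switch off). Returns True if switched."""
        if self.state is not NucleotideState.GTP_BOUND:
            return False
        self.state = NucleotideState.GDP_BOUND
        return True


if __name__ == "__main__":
    ras = GTPase("Ras")
    ras.apply_gef()
    print(ras.name, "after GEF:", ras.state.value)
    ras.apply_gap()
    print(ras.name, "after GAP:", ras.state.value)
    ras.bind_gdi()
    print("GEF succeeds with GDI bound?", ras.apply_gef())
```

The sketch omits the slow intrinsic hydrolysis that occurs without a GAP and the membrane-localization role of prenylation.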

The Ras superfamily of GTPases can be roughly divided into three main families. The first family is the “true” Ras proteins, each of which has the ability to function as an oncogene following mutational activation. These proteins transmit signals from tyrosine kinases at the plasma membrane to a cascade of serine/threonine kinases, which deliver signals to the cell nucleus. Constitutive activation of the pathway contributes to malignant transformation. The second family is the Rho/Rac proteins, involved in organizing the cytoskeleton. Rac is required for membrane ruffling induced by growth factors, and the formation of actin stress fibers requires Rho. In yeast, the CDC42 product controls cell polarity, another process in which actin is involved. In addition, Rac proteins are components of the NADPH oxidase system that generates superoxide in phagocytes. The third family is the Rab proteins. Members of this group regulate membrane trafficking, i.e., transport of vesicles between different intracellular compartments.

In addition to the three major families, further subgroups exist, exemplified by Ran and Arf. Ran proteins are nuclear GTPases involved in mitosis. Arf (ADP-ribosylation factor) proteins are necessary for ADP-ribosylation of G_(sa) (the GTPase subunit of s-type heterotrimeric G-proteins) by cholera toxin and are thought to be involved in membrane vesicle fusion and transport.

Ras GEFs are proteins that activate Ras proteins by exchanging bound GDP for free GTP. These include Ras GRF, MmSosI, DnSoS, Step 6, Cdc25, Scd25, Lte1, and BUD5. The loss of GEF function can be complemented by mutations that constitutively activate the Ras proteins or, in some cases, by a loss of GAP activity. GEFs first associate with the GDP-bound form of the GTPase. GDP dissociates from this complex at an increased rate, leaving the GEF bound to the empty GTPase. GTP then binds immediately, effecting GEF dissociation and leaving the GTPase in active form. Accordingly, a stable complex can exist between GEF and GTPase in the absence of nucleotide. Thus, GEFs recognize both GDP- and GTP-bound forms of Ras in vitro and in vivo.

Dominant negative Ras mutants exist that block normal Ras activation. These have reduced affinity for GTP and may be defective in the final step of the exchange process, i.e. displacement of GEF by GTP. Accordingly, these mutants sequester GEF into a dead-end complex and are useful to remove GEF activity from cells so that activation of endogenous Ras proteins cannot occur. However, Ras may also be activated by inhibiting GAP activity without the need for GEF.

GEFs also include ral GEF. It is 20-fold more active on Ral A and Ral B than on members of the Ras, Rho/Rac and Rab GTPase families.

GEFs also include rap GEF. Cell polarity and budding in yeast involve GTPases of the Rap and Rho subgroup. A GEF specific for mammalian Rap proteins remains to be identified. Rap has the ability to interfere with Ras signaling by blocking activation of RAF and the serine/threonine kinase cascade.

GEFs also include Rho/Rac GEFs. GEFs specific for Rac and Rho proteins include, but are not limited to, Cdc24, Dbl, Vav, Bcr, Ras GRF, and ect2. The human Dbl has been shown to act as a GEF for CDC42Hs (the human homolog of CDC42 is known as G25K) and on Rho. Further, Dbl binds several Rac/Rho-like proteins in vitro.

smg GDS (small GTP-binding protein) was originally described as a GEF for mammalian Rap proteins. It also promotes nucleotide exchange on Rho and Rac proteins. The protein works efficiently only on isoprenylated proteins. Ras and Rho/Rac proteins are modified by different isoprenoid moieties. Rho/Rac proteins receive 20-carbon geranylgeranyl groups.

Guanine nucleotide dissociation inhibitors (GDIs) include rab GDI. The protein affects the rate of GDP dissociation from Rab proteins. It inhibits GDP/GTP exchange and prevents the GDP-bound form from binding to membranes. These activities depend on the C-terminal geranylgeranyl group, at least of Rab3A.

Rho GDI was first identified as a factor capable of inhibiting dissociation of GDP from post-translationally modified Rho proteins. It has the ability to remove Rho proteins from cellular membranes in cell-free systems. This indicates that it could regulate the available Rho proteins associated with membranes or facilitate movement of Rho from one membrane compartment to another. Rac proteins bound to Rho GDI have also been identified as components of the NADPH oxidase system that generates oxygen radicals in activated phagocytes. Rac and Rho GDI form a heterodimer required for oxidase stimulation in vitro. Along with two other cytosolic factors, the components assemble into a membrane-bound complex which uses electrons from NADPH to generate superoxide anions. Recombinant Rac proteins in their GDP-bound state can replace the requirement for Rac and Rho GDI in this system. This indicates that Rho GDI can recognize the GTP-bound form of Rac and protect it from RacGAPs.

GTPase-activating proteins are disclosed in Table 1 in Boguski et al., above. These include Ras GAP proteins. Ras proteins have low intrinsic GTPase activity and their inactivation is dependent on GAP in vivo. Of the Ras GAPs, neurofibromin, p120 GAP, Ira1, and Ira2 also have specificity for Rac. Of the rap GAP family, Rap1GAP also has specificity for Rac. Rho/Rac GAPs with specificity for Rac include Bcr, N-chimerin, rotund, p190, GRB-1/p85a, and 3BP-1.

Ras-like GTPases are targeted to the membranes where they act through the post-translational attachment of isoprenoid lipids (or prenyl groups). Prenylation involves the covalent thioether linkage of farnesyl (15-carbon) or geranylgeranyl (20-carbon) groups to cysteine residues near the C-terminus. These reactions are catalyzed by prenyltransferases that differ in their isoprenoid substrates and protein targets. Type 1 geranylgeranyl transferase recognizes a CAAX motif but prefers a leucine residue in the X-position. Substrates include members of the Rho/Rac families.

p21-activated protein kinases (PAKs) are activated through direct interaction with the GTPases Rac and Cdc42Hs. These GTPases are implicated in the control of the mitogen-activated protein (MAP) kinase c-Jun N-terminal kinase (JNK) and the reorganization of the actin cytoskeleton. Aronheim et al. (Current Biology 8:1125-1128 (1998)) reported on the biological role of PAK2 and identified its molecular targets. A two-hybrid system, “the Ras recruitment system,” was used to detect protein-protein interactions at the inner surface of the plasma membrane. The PAK2 regulatory domain was fused at the carboxy terminus of a Ras mutant protein and screened against a cDNA library. Four clones were identified that interacted specifically with the PAK regulatory region and were shown to encode a homolog of the GTPase Cdc42Hs. This protein, designated Chp, showed an overall sequence identity to Cdc42Hs of approximately 52%. Results from microinjection of this protein into cells implicated it in the induction of lamellipodia and showed that it activates the JNK MAP kinase cascade.

Proteins regulating Ras and its relatives have been reviewed in Boguski et al., Nature 366:643-654 (1993), summarized below. As indicated above, GTPases cycle between inactive and active states bound to GDP and GTP, respectively, and this cycling can be influenced by three different classes of proteins that switch the GTPase on, switch it off, and protect it from switching. Classes of regulatory proteins of Ras-like GTPases include GEF, GDI, and GAP. GEFs catalyze exchange of GDP for GTP. GAPs catalyze conversion of GTP-bound forms back to their inactive GDP states. GDI proteins for Rab and Rho affect nucleotide dissociation and GAP attack and may also be involved in membrane localization and solubility. The intracellular location of the GTPase can be controlled by a fourth class of regulatory protein affecting the regulators with which the GTPase can interact.

Table 1 of Boguski et al. lists various GAPs, the organisms from which they are derived, substrate specificity, and other characterization. These include (in the Table) the following GAPs: RasGAP; Neurofibromin (NF1) with a positive specificity for H-ras, N-ras, K-ras, RAS1 and RAS2 and a negative specificity for Rho, Rac, and Rab; p120GAP with a positive specificity for H-ras, N-ras, K-ras, R-ras, RAS1 and RAS2 and a negative specificity for Rho, Rac and Rab; Gap1 with a positive specificity for Ras1; Ira1 with a positive specificity for RAS and RAS2 and a negative specificity for Rho, Rac and Rab and potentially H-ras; Ira2 with a positive specificity for RAS and RAS2 and a negative specificity for Rho, Rac and Rab and potentially H-ras; Sar1/gap1 with a positive specificity for Ras1, RAS1 and Ras2; Bud2 with a positive specificity for Bud1; RapGAP and Rap1GAP with a positive specificity for Rap1A and Rap2 and a negative specificity for Ras, Rho and Rac; Rho/racGAP and Bar with a positive specificity for Rac and CDC42Hs and a negative specificity for Rho and Ras; n-Chimaerin with a positive specificity for Rac and a negative specificity for Rho, CDC42Hs and Ras; rotund locus and p190 with a positive specificity for Rac, Rho and CDC42Hs and a negative specificity for Ras, GRB-1/p85a and 3BP-1.

RasGAP is one class of GAP. Ras proteins have a very low intrinsic GTPase activity and their inactivation is dependent on GAPs in vivo. For example, some oncogenic mutants of Ras proteins are resistant to GAP-mediated GTPase stimulation and are constitutively blocked in their active GTP-bound states. Yeast contains two RasGAP proteins, IRA1 and IRA2, which contain domains homologous to the human and other mammalian p120-GAPs. In the absence of IRA gene product, yeast RAS proteins accumulate in their GTP-bound state, becoming hyperactive and leading to overproduction of cAMP. In yeast, therefore, RasGAPs are not effectors but serve as negative regulators. NF1 is a human protein defective in von Recklinghausen neurofibromatosis. This protein contains a domain homologous to the catalytic domains of p120-GAP, IRA1, and IRA2. It may, in fact, be the mammalian homolog of IRA1 and IRA2. Mutant NF1 alleles are associated with sporadic cancers unrelated to neurofibromatosis or to neural crest tissues. Drosophila contains a protein 70% identical to neurofibromin. It also contains a distinct RasGAP (referred to as GAP1) that is a component of the Sos tyrosine kinase/Ras1 signalling pathway. Loss of GAP1 stimulates Ras1 function, indicating that it is a negative regulator.

RapGAP is another GAP class. Rap1A is around 50% identical to Ras and, like Ras, binds to p120-GAP and to raf1 by its effector binding domain. Rap1A binds p120-GAP but its GTPase activity is not enhanced by this interaction. Another protein, rap1GAP, is responsible for the Rap1A GTPase activation. Rap1GAP is unrelated to rasGAP but contains several sites for phosphorylation by Cdc2 and cAMP-dependent kinases. Ras proteins, and most GTPases, depend on a glutamine residue at position 61 (or equivalent) for intrinsic or GAP-mediated GTP hydrolysis. Rap1, however, has a threonine at this position.

Rho/Rac GAP is another class of GAP. A mammalian GAP specific for Rho has been purified and shown to contain a region related to the C-terminal domain of Bcr and to a human brain protein, n-chimaerin. Bcr is a putative RhoGEF. Bcr and n-chimaerin stimulate GTP hydrolysis by the Rho-like proteins Rac1 and Rac2, but not by Rho proteins themselves. This activity is mediated by the C-terminal 401 amino acids of Bcr. This domain does not resemble RasGAP or Rap1GAP. Chimaerin also contains an N-terminal DAG binding motif. Further, a multidomain protein, p90, that binds to p120-GAP and regulates its activity contains a central domain related to a putative DNA binding transcriptional repressor. At the C-terminus, there is a 145 residue region that is related to RhoGAPs.

GTPase activators (GAPs) are a major target for drug action and development. Accordingly, it is valuable to the field of pharmaceutical development to identify and characterize previously unknown GAPs. The present invention advances the state of the art by providing previously unidentified human GAPs.

SUMMARY OF THE INVENTION

It is an object of the invention to identify novel GAPs.

It is a further object of the invention to provide novel GAP polypeptides that are useful as reagents or targets in assays applicable to treatment and diagnosis of GAP-mediated disorders.

It is a further object of the invention to provide polynucleotides corresponding to the novel polypeptides that are useful as targets and reagents in assays applicable to treatment and diagnosis of GAP-mediated disorders and useful for producing novel GAP polypeptides by recombinant methods.

A specific object of the invention is to identify compounds that act as agonists and antagonists and modulate the expression or activity of the novel GAP.

A further specific object of the invention is to provide compounds that modulate expression of the GAP for treatment and diagnosis of GAP-related disorders.

The invention is thus based on the identification of two novel GAPs, designated herein 26651 and 26138.

The invention provides isolated GAP polypeptides including a polypeptide having an amino acid sequence shown in SEQ ID NO:22, SEQ ID NO:25, or an amino acid sequence encoded by the cDNA deposited with the ATCC as PTA-1918 on May 25, 2000 (“the deposited cDNA”).

The invention also provides isolated GAP nucleic acid molecules having a sequence shown in SEQ ID NO:21, 23, 24, or 26, or in the deposited cDNA.

The invention also provides variant polypeptides having an amino acid sequence that is substantially homologous to an amino acid sequence shown in SEQ ID NO:22, SEQ ID NO:25, or encoded by the deposited cDNA.

The invention also provides variant nucleic acid sequences that are substantially homologous to a nucleotide sequence shown in SEQ ID NO:21, 23, 24, or 26, or in the deposited cDNA.

The invention also provides fragments of polypeptides shown in SEQ ID NO:22 or SEQ ID NO:25 and polynucleotides shown in SEQ ID NO:21, 23, 24, or 26, as well as substantially homologous fragments of the polypeptide or nucleic acid.

The invention further provides nucleic acid constructs comprising the nucleic acid molecules described above. In a preferred embodiment, the nucleic acid molecules of the invention are operatively linked to a regulatory sequence.

The invention also provides vectors and host cells for expressing the GAP nucleic acid molecules and polypeptides and particularly recombinant vectors and host cells.

The invention also provides methods of making the vectors and host cells and methods for using them to produce the GAP nucleic acid molecules and polypeptides.

The invention also provides antibodies or antigen-binding fragments thereof that selectively bind the GAP polypeptides and fragments.

The invention also provides methods of screening for compounds that modulate expression or activity of the GAP polypeptides or nucleic acid (RNA or DNA).

The invention also provides a process for modulating the GAP polypeptide or nucleic acid expression or activity, especially using the screened compounds. Modulation may be used to treat conditions related to aberrant activity or expression of the GAP polypeptides or nucleic acids.

The invention also provides assays for determining the presence or absence of and level of the GAP polypeptides or nucleic acid molecules in a biological sample, including for disease diagnosis.

The invention also provides assays for determining the presence of a mutation in the GAP polypeptides or nucleic acid molecules, including for disease diagnosis.

In still a further embodiment, the invention provides a computer readable means containing the nucleotide and/or amino acid sequences of the nucleic acids and polypeptides of the invention, respectively.

DETAILED DESCRIPTION OF THE INVENTION

The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the invention are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.

Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.

Receptor Function/Signal Pathway

As used herein, a “signaling pathway” refers to the modulation (e.g., stimulation or inhibition) of a cellular function/activity upon the binding of a ligand to a GPCR. Examples of such functions include mobilization of intracellular molecules that participate in a signal transduction pathway, e.g., phosphatidylinositol 4,5-bisphosphate (PIP₂), inositol 1,4,5-triphosphate (IP3) and adenylate cyclase; polarization of the plasma membrane; production or secretion of molecules; alteration in the structure of a cellular component; cell proliferation, e.g., synthesis of DNA; cell migration; cell differentiation; and cell survival.

Since the 26651 GAP is expressed in tissues that include, but are not limited to, adrenal gland, pituitary, skin and spinal cord, cells participating in a receptor protein signaling pathway in which this protein is involved may include, but are not limited to, cells derived from these tissues.

Since the 26138 GAP is expressed in tonsil, spleen, fetal liver, adult liver, fibrotic liver, granulocytes, neutrophils, erythroid cells, adipose tissue, bone marrow, colon, lung, kidney, heart, lymphocyte, megakaryocytes and T-cells, among others, cells participating in a receptor protein signaling pathway in which this protein is involved may include, but are not limited to, cells derived from these tissues as well as those tissues and cell lines shown in FIGS. 64A-64B.

The response mediated by a receptor protein depends on the type of cell. For example, in some cells, binding of a ligand to the receptor protein may stimulate an activity such as release of compounds, gating of a channel, cellular adhesion, migration, differentiation, etc., through phosphatidylinositol or cyclic AMP metabolism and turnover, while in other cells, the binding of the ligand will produce a different result. Regardless of the cellular activity/response modulated by the receptor protein, the protein, as a GPCR, would interact with G proteins to produce one or more secondary signals, in a variety of intracellular signal transduction pathways, e.g., through phosphatidylinositol or cyclic AMP metabolism and turnover, in a cell.

As used herein, “phosphatidylinositol turnover and metabolism” refers to the molecules involved in the turnover and metabolism of phosphatidylinositol 4,5-bisphosphate (PIP₂) as well as to the activities of these molecules. PIP₂ is a phospholipid found in the cytosolic leaflet of the plasma membrane. Binding of ligand to the receptor activates, in some cells, the plasma-membrane enzyme phospholipase C that in turn can hydrolyze PIP₂ to produce 1,2-diacylglycerol (DAG) and inositol 1,4,5-triphosphate (IP3). Once formed, IP3 can diffuse to the endoplasmic reticulum surface where it can bind an IP3 receptor, e.g., a calcium channel protein containing an IP3 binding site. IP3 binding can induce opening of the channel, allowing calcium ions to be released into the cytoplasm. IP3 can also be phosphorylated by a specific kinase to form inositol 1,3,4,5-tetraphosphate (IP4), a molecule which can cause calcium entry into the cytoplasm from the extracellular medium. IP3 and IP4 can subsequently be hydrolyzed very rapidly to the inactive products inositol 1,4-bisphosphate (IP2) and inositol 1,3,4-triphosphate, respectively. These inactive products can be recycled by the cell to synthesize PIP₂. The other second messenger produced by the hydrolysis of PIP₂, namely 1,2-diacylglycerol (DAG), remains in the cell membrane where it can serve to activate the enzyme protein kinase C. Protein kinase C is usually found soluble in the cytoplasm of the cell, but upon an increase in the intracellular calcium concentration, this enzyme can move to the plasma membrane where it can be activated by DAG. The activation of protein kinase C in different cells results in various cellular responses such as the phosphorylation of glycogen synthase, or the phosphorylation of various transcription factors, e.g., NF-kB. The language “phosphatidylinositol activity”, as used herein, refers to an activity of PIP₂ or one of its metabolites.

Another signaling pathway in which a receptor may participate is the cAMP turnover pathway. As used herein, “cyclic AMP turnover and metabolism” refers to the molecules involved in the turnover and metabolism of cyclic AMP (cAMP) as well as to the activities of these molecules. Cyclic AMP is a second messenger produced in response to ligand-induced stimulation of certain G protein coupled receptors. In the cAMP signaling pathway, binding of a ligand to a GPCR can lead to the activation of the enzyme adenylate cyclase, which catalyzes the synthesis of cAMP. The newly synthesized cAMP can in turn activate a cAMP-dependent protein kinase. This activated kinase can phosphorylate a voltage-gated potassium channel protein, or an associated protein, and lead to the inability of the potassium channel to open during an action potential. The inability of the potassium channel to open results in a decrease in the outward flow of potassium, which normally repolarizes the membrane of a neuron, leading to prolonged membrane depolarization.

Polypeptides

The invention is based on the identification of novel human GAPs. Specifically, an expressed sequence tag (EST) was selected based on homology to GAP sequences. This EST was used to design primers based on primary sequences that it contains and used to identify a cDNA from human cDNA libraries. Positive clones were sequenced and the overlapping fragments were assembled. Analysis of the assembled sequence revealed that the cloned cDNA molecule encodes a GAP.

The invention thus relates to novel GAPs having the deduced amino acid sequence shown in FIGS. 52A-52B and 57A-57C (SEQ ID NO:22 and SEQ ID NO:25) or having the amino acid sequence encoded by the deposited cDNA, ATCC Patent Deposit No. PTA-1918.

Plasmids containing the 26651 sequences of the invention were deposited with the Patent Depository of the American Type Culture Collection (ATCC), Manassas, Va., on May 25, 2000 and assigned Patent Deposit No. PTA-1918. The deposit will be maintained under the terms of the Budapest Treaty on the International Recognition of the Deposit of Microorganisms. The deposit is provided as a convenience to those of skill in the art and is not an admission that a deposit is required under 35 U.S.C. § 112. The deposited sequence, as well as the polypeptide encoded by the sequence, is incorporated herein by reference and controls in the event of any conflict, such as a sequencing error, with description in this application.

“GAP”, “GAP polypeptide” or “GAP protein” refers to a polypeptide set forth in SEQ ID NO:22, SEQ ID NO:25, or encoded by the deposited cDNA. The terms, however, further include the numerous variants described herein, as well as fragments derived from the full-length GAP polypeptide and variants.

The present invention thus provides an isolated or purified GAP polypeptide and variants and fragments thereof. By “variants” is intended proteins or polypeptides having an amino acid sequence that is at least about 60%, 65%, or 70%, preferably about 75%, 85%, 95%, or 98% identical to the amino acid sequence of SEQ ID NO:22 or SEQ ID NO:25. Variants also include polypeptides encoded by the cDNA insert of the plasmid deposited with ATCC as Patent Deposit No. PTA-1918, or polypeptides encoded by a nucleic acid molecule that hybridizes to the nucleic acid molecule of SEQ ID NO:21, 23, 24 or 26, or a complement thereof, under stringent conditions. In another embodiment, a variant of an isolated polypeptide of the present invention differs by at least 1, but less than 5, 10, 20, 50, or 100 amino acid residues from the sequence shown in SEQ ID NO:22 or SEQ ID NO:25. If alignment is needed for this comparison, the sequences should be aligned for maximum identity. “Looped” out sequences from deletions or insertions, or mismatches, are considered differences. Such variants retain the functional activity of the polypeptide set forth in SEQ ID NO:22 or SEQ ID NO:25. Variants include polypeptides that differ in amino acid sequence due to natural allelic variation or mutagenesis.

Based on a BLAST search of the 26651 sequence, homology was shown to human and other mammalian Rho-GTPase activators. A search for complete domains in PFAM showed a classification in the RhoGAP family. PRODOM analysis also shows a relationship with Rho-type GTPase activating proteins.

A search for complete domains in PFAM with the 26138 sequence showed classification in the rasGAP family, GTPase-activator protein for Ras-like GTPase.

The 26651 nucleic acid is expressed in tissues that include, but are not limited to, adrenal gland, pituitary, skin and spinal cord. Chromosome mapping with STS using WI-13730 shows that the gene is located on the X chromosome between DXS 994 and DXS 1062 (143.2-145 cM).

The 26138 nucleic acid is expressed in tissues that include, but are not limited to, tonsil, spleen, fetal liver, adult liver, fibrotic liver, granulocytes, neutrophils, erythroid cells, adipose tissue, bone marrow, colon, lung, kidney, heart, lymphocytes, megakaryocytes and T-cells, as well as the tissues and cell lines shown in FIGS. 64A-64B. Chromosome mapping information for this gene is shown in FIG. 63.

As used herein, a polypeptide is said to be “isolated” or “purified” when it is substantially free of cellular material when it is isolated from recombinant and non-recombinant cells, or free of chemical precursors or other chemicals when it is chemically synthesized. A polypeptide, however, can be joined to another polypeptide with which it is not normally associated in a cell and still be considered “isolated” or “purified.”

The GAP polypeptides can be purified to homogeneity. It is understood, however, that preparations in which the polypeptide is not purified to homogeneity are useful and considered to contain an isolated form of the polypeptide. The critical feature is that the preparation allows for the desired function of the polypeptide, even in the presence of considerable amounts of other components. Thus, the invention encompasses various degrees of purity.

In one embodiment, the language “substantially free of cellular material” includes preparations of the GAP polypeptide having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins. When the polypeptide is recombinantly produced, it can also be substantially free of culture medium, i.e., culture medium represents less than about 20%, less than about 10%, or less than about 5% of the volume of the protein preparation.

A polypeptide is also considered to be isolated when it is part of a membrane preparation or is purified and then reconstituted with membrane vesicles or liposomes.

The language “substantially free of chemical precursors or other chemicals” includes preparations of the polypeptide in which it is separated from chemical precursors or other chemicals that are involved in its synthesis. In one embodiment, the language “substantially free of chemical precursors or other chemicals” includes preparations of the polypeptide having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about 20% chemical precursors or other chemicals, less than about 10% chemical precursors or other chemicals, or less than about 5% chemical precursors or other chemicals.

In one embodiment, the polypeptide comprises an amino acid sequence shown in SEQ ID NO:22 or SEQ ID NO:25. However, the invention also encompasses sequence variants. Variants include a substantially homologous protein encoded by the same genetic locus in an organism, i.e., an allelic variant. Variants also encompass proteins derived from other genetic loci in an organism, but having substantial homology to a GAP of SEQ ID NO:22 or SEQ ID NO:25. Variants also include proteins substantially homologous to the GAP but derived from another organism, i.e., an ortholog. Variants also include proteins that are substantially homologous to GAP polypeptides of the invention that are produced by chemical synthesis. Variants also include proteins that are substantially homologous to the GAP that are produced by recombinant methods. Variants retain the GAP activity of the polypeptides set forth in SEQ ID NO:22 or SEQ ID NO:25. It is understood, however, that variants exclude any amino acid sequences disclosed prior to the invention.

As used herein, two amino acid or nucleotide sequences are substantially homologous when the sequences have at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity. A substantially homologous amino acid sequence, according to the present invention, will be encoded by a nucleic acid sequence hybridizing to the nucleic acid sequence, or portion thereof, of the sequence shown in SEQ ID NO:21, 23, 24, or 26 under stringent conditions as more fully described below.

To determine the percent identity of two amino acid sequences or of two nucleic acid sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal alignment and non-homologous sequences can be disregarded for comparison purposes). In a preferred embodiment, the length of a reference sequence aligned for comparison purposes is at least 30%, preferably at least 40%, more preferably at least 50%, even more preferably at least 60%, and even more preferably at least 70%, 80%, or 90% of the length of the reference sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, then the molecules are identical at that position (as used herein amino acid or nucleic acid “identity” is equivalent to amino acid or nucleic acid “homology”). The percent identity between the two sequences is a function of the number of identical positions shared by the sequences, taking into account the number of gaps, and the length of each gap, which need to be introduced for optimal alignment of the two sequences.
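The position-by-position comparison described above can be illustrated with a minimal Python sketch. It is offered only as an illustration and is not the calculation performed by any particular software package. The function name `percent_identity`, the use of "-" as the gap character, and the choice of denominator (every alignment column, gapped columns included) are assumptions made for this example.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two sequences that have ALREADY been aligned.

    Assumption for this sketch: the denominator is the number of alignment
    columns (columns containing a gap count as non-identical), so that the
    number and length of gaps lower the reported identity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    identical = 0
    compared = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap and res_b == gap:
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if res_a == res_b:
            identical += 1  # same residue or nucleotide at corresponding positions

    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Hypothetical toy alignment, not a sequence of the invention.
    print(percent_identity("AC-E", "ACDE"))
```

Other conventions for the denominator exist (for example, the length of the shorter sequence), and the value obtained depends on which convention is chosen.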

The comparison of sequences and determination of percent identity between two sequences can be accomplished using a mathematical algorithm. In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch (1970) J. Mol. Biol. 48:444-453 algorithm which has been incorporated into the GAP program in the GCG software package (available at www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred set of parameters (and the one that should be used if the practitioner is uncertain about what parameters should be applied to determine if a molecule is within a sequence identity or homology limitation of the invention) is a BLOSUM 62 scoring matrix with a gap open penalty of 12, a gap extend penalty of 4, and a frameshift gap penalty of 5.
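For orientation, a global alignment of the Needleman and Wunsch type can be sketched in a few lines of Python. This sketch is a simplification and does not reproduce the GAP program: it assumes a flat match/mismatch score in place of a BLOSUM 62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. The function name and the default scores are hypothetical choices made for illustration.

```python
def needleman_wunsch(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2):
    """Global alignment by dynamic programming (simplified scoring, see lead-in).

    Returns (score, aligned_a, aligned_b), with '-' marking gap positions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i: int, j: int) -> int:
        # substitution score for aligning a[i-1] with b[j-1]
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align the two residues
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    # Hypothetical toy peptides, not sequences of the invention.
    print(needleman_wunsch("ACDE", "ACE"))
```

With a substitution matrix and affine gap penalties substituted for the flat scores, the same recurrence underlies the matrix and gap-weight parameters recited above.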

The percent identity between two amino acid or nucleotide sequences can be determined using the algorithm of E. Meyers and W. Miller (1989) CABIOS 4:11-17 which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12 and a gap penalty of 4.

The nucleic acid and protein sequences described herein can be used as a “query sequence” to perform a search against public databases to, for example, identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul, et al. (1990) J. Mol. Biol. 215:403-10. BLAST nucleotide searches can be performed with the NBLAST program, score=100, wordlength=12 to obtain nucleotide sequences homologous to the 26651 or 26138 nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score=50, wordlength=3 to obtain amino acid sequences homologous to the 26651 or 26138 protein molecules of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (1997) Nucleic Acids Res. 25(17):3389-3402. When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used. See www.ncbi.nlm.nih.gov.

The invention also encompasses polypeptides having a lower degree of identity but having sufficient similarity so as to perform one or more of the same functions performed by a polypeptide of the invention. Similarity is determined by conservative amino acid substitution. Such substitutions are those that substitute a given amino acid in a polypeptide by another amino acid of like characteristics. Conservative substitutions are likely to be phenotypically silent. Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).

TABLE 1
Conservative Amino Acid Substitutions

Aromatic      Phenylalanine, Tryptophan, Tyrosine
Hydrophobic   Leucine, Isoleucine, Valine
Polar         Glutamine, Asparagine
Basic         Arginine, Lysine, Histidine
Acidic        Aspartic Acid, Glutamic Acid
Small         Alanine, Serine, Threonine, Methionine, Glycine
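The groupings of Table 1 can be encoded as a simple lookup, as in the following Python sketch. The one-letter residue codes, the dictionary name `TABLE_1_GROUPS` and the function name `is_conservative` are conventions adopted for this example only. The rule applied (two different residues in the same Table 1 group) is one straightforward reading of the table.

```python
# Table 1 groups, written with standard one-letter amino acid codes.
TABLE_1_GROUPS = {
    "Aromatic": set("FWY"),      # Phenylalanine, Tryptophan, Tyrosine
    "Hydrophobic": set("LIV"),   # Leucine, Isoleucine, Valine
    "Polar": set("QN"),          # Glutamine, Asparagine
    "Basic": set("RKH"),         # Arginine, Lysine, Histidine
    "Acidic": set("DE"),         # Aspartic Acid, Glutamic Acid
    "Small": set("ASTMG"),       # Alanine, Serine, Threonine, Methionine, Glycine
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the two residues differ and fall in the same Table 1 group.

    Residues not listed in Table 1 (e.g., Cys, Pro) are never treated as
    conservative here; that is a simplification of this sketch.
    """
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # no substitution took place
    return any(
        original in members and replacement in members
        for members in TABLE_1_GROUPS.values()
    )


if __name__ == "__main__":
    print(is_conservative("D", "E"))  # acidic for acidic
    print(is_conservative("K", "D"))  # basic for acidic
```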

A variant polypeptide can differ in amino acid sequence by one or more substitutions, deletions, insertions, inversions, fusions, and truncations or a combination of any of these.

Variant polypeptides can be fully functional or can lack function in one or more activities. Thus, in the present case, variations can affect the function, for example, of one or more regions corresponding to membrane association, GTPase binding, or interaction with regulatory proteins such as those in the background above.

Fully functional variants typically contain only conservative variation or variation in non-critical residues or in non-critical regions. Functional variants can also contain substitution of similar amino acids which result in no change or an insignificant change in function. Alternatively, such substitutions may positively or negatively affect function to some degree.

Non-functional variants typically contain one or more non-conservative amino acid substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or deletion in a critical residue or critical region.

As indicated, variants can be naturally-occurring or can be made by recombinant means or chemical synthesis to provide useful and novel characteristics for a polypeptide of the invention. This includes preventing immunogenicity from pharmaceutical formulations by preventing protein aggregation.

Useful variations further include alteration of binding characteristics. For example, one embodiment involves a variation at the binding site that results in binding but not release, or slower release of a binding molecule. A further useful variation at the same sites can result in a higher affinity. Useful variations also include changes that provide for affinity for another binding molecule. Another useful variation includes one that allows binding but which prevents activation by an effector. A useful variation affects binding to the GTPase, e.g., Ras or Rho. Binding can be with greater affinity, with less tendency to dissociate, or lesser affinity with a higher tendency to dissociate. Alternatively, a variation can affect interaction with any of the regulatory proteins which in turn affects association with the GTPase. A further useful variation affects interaction with the regulatory protein responsible for subcellular localization of the GAP.

Another useful variation provides a fusion protein in which one or more domains or subregions is operationally fused to one or more domains or subregions from another GAP, including, but not limited to, subfamilies discussed above in the background in the families of GTPase activators.

Amino acids that are essential for function can be identified by methods known in the art, such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al., Science 244:1081-1085 (1989)). The latter procedure introduces single alanine mutations at every residue in the molecule. The resulting mutant molecules are then tested for biological activity such as receptor binding or in vitro, or in vivo proliferative activity. Sites that are critical for substrate or effector binding can also be determined by structural analysis such as crystallization, nuclear magnetic resonance or photoaffinity labeling (Smith et al., J. Mol. Biol. 224:899-904 (1992); de Vos et al. Science 255:306-312 (1992)).
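The set of single-alanine mutants contemplated by alanine-scanning mutagenesis can be enumerated in silico, as in the following Python sketch. It only generates the mutant sequences; the testing of each mutant for biological activity is an experimental step that the code does not model. The function name `alanine_scan`, the 1-based position numbering and the decision to skip residues that are already alanine are assumptions of this example.

```python
from typing import Iterator, Tuple


def alanine_scan(sequence: str) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, original_residue, mutant_sequence) for each single
    alanine substitution, using 1-based positions.

    Assumption for this sketch: positions that are already alanine are
    skipped, since substituting Ala for Ala yields the parent sequence.
    """
    for index, residue in enumerate(sequence):
        if residue == "A":
            continue
        mutant = sequence[:index] + "A" + sequence[index + 1:]
        yield index + 1, residue, mutant


if __name__ == "__main__":
    # Hypothetical toy peptide, not a sequence of the invention.
    for position, residue, mutant in alanine_scan("MAKL"):
        print(f"{residue}{position}A\t{mutant}")
```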

Substantial homology can be to the entire nucleic acid or amino acid sequence or to fragments of these sequences.

The invention thus also includes polypeptide fragments of the GAPs. Fragments can be derived from an amino acid sequence shown in SEQ ID NO:22 or SEQ ID NO:25. However, the invention also encompasses fragments of the variants of the proteins of the invention as described herein.

The fragments to which the invention pertains, however, are not to be construed as encompassing fragments that may be disclosed prior to the present invention.

As used herein, a fragment comprises at least 5 contiguous amino acids. Fragments can retain one or more of the biological activities of the protein, for example the ability to bind to a GTPase. Fragments also include those that can be used as an immunogen to generate antibodies.

Biologically active fragments (peptides which are about, for example, 5-10, 10-15, 15-20, 25-30, 35-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100, 100-110, 110-120, 120-130, 130-150, 150-200, 200-250, 250-300, 300-350, 350-400, 400-450, 450-500, 500-547 or up to the number of amino acids in the full length sequence) can comprise a domain or motif, e.g., a GTPase binding site, a regulatory site for interaction with any of the regulatory proteins affecting GAP activity, membrane anchoring site, or glycosylation sites, phosphorylation sites, and myristoylation sites. Such domains or motifs can be identified by means of routine computerized homology searching procedures. Domains/motifs include, but are not limited to, those shown in the figures.

Fragments also include combinations of domains or motifs including, but not limited to, those mentioned above. Fragments, for example, can extend in one or both directions from the functional site to encompass 5, 10, 15, 20, 30, 40, 50, 100, 150, 200, 250, 300, 350, 400, 450, 500, 547, or up to the number of amino acids disclosed in SEQ ID NO:22 and SEQ ID NO:25. Further, fragments can include sub-fragments of the specific domains mentioned above, which sub-fragments retain the function of the domain from which they are derived.

These regions can be identified by well-known methods involving computerized homology analysis.

Fragments also include antigenic fragments and specifically those shown to have a high antigenic index in FIGS. 54 and 58.

Further possible fragments include, but are not limited to, fragments defining a GTPase binding site, regulatory protein binding, or membrane association. By this is intended a discrete fragment that provides the relevant function or allows the relevant function to be identified. In a preferred embodiment, the fragment contains a GTPase-binding site.

The invention also provides fragments with immunogenic properties. These contain an epitope-bearing portion of a protein of the invention and variants. These epitope-bearing peptides are useful to raise antibodies that bind specifically to a polypeptide of the invention or region or fragment. These peptides can contain at least 6, 10, 12, at least 14, or between at least about 15 to about 30 amino acids.

A polypeptide of the invention (including variants and fragments which may have been disclosed prior to the present invention) is useful for biological assays related to GAPs, especially of the RasGAP or RhoGAP family. Such assays involve any of the known GAP functions or activities or properties useful for diagnosis and treatment of GAP-related conditions. They include, especially, diseases involving the tissues in which a protein of the invention is expressed as disclosed herein. For GAP activity, assays include but are not limited to those disclosed herein, including those in references cited in the background herein, which are incorporated herein by reference for teaching these assays. Such assays include but are not limited to GTPase binding or activation, binding to GAP regulatory proteins, complex formation with any of the regulatory proteins, and biological effects such as those disclosed in the Background above. These include but are not limited to reorganization of the actin cytoskeleton, transformation, growth, effects on differentiation, membrane ruffling induced by growth factors, formation of actin stress fibers, and generation of superoxide in phagocytes.

Disorders involving T-cells include, but are not limited to, cell-mediated hypersensitivity, such as delayed type hypersensitivity and T-cell-mediated cytotoxicity, and transplant rejection; autoimmune diseases, such as systemic lupus erythematosus, Sjögren syndrome, systemic sclerosis, inflammatory myopathies, mixed connective tissue disease, and polyarteritis nodosa and other vasculitides; immunologic deficiency syndromes, including but not limited to, primary immunodeficiencies, such as thymic hypoplasia, severe combined immunodeficiency diseases, and AIDS; leukopenia; reactive (inflammatory) proliferations of white cells, including but not limited to, leukocytosis, acute nonspecific lymphadenitis, and chronic nonspecific lymphadenitis; neoplastic proliferations of white cells, including but not limited to lymphoid neoplasms, such as precursor T-cell neoplasms, such as acute lymphoblastic leukemia/lymphoma, peripheral T-cell and natural killer cell neoplasms that include peripheral T-cell lymphoma, unspecified, adult T-cell leukemia/lymphoma, mycosis fungoides and Sezary syndrome, and Hodgkin disease.

Diseases of the skin include, but are not limited to, disorders of pigmentation and melanocytes, including but not limited to, vitiligo, freckle, melasma, lentigo, nevocellular nevus, dysplastic nevi, and malignant melanoma; benign epithelial tumors, including but not limited to, seborrheic keratoses, acanthosis nigricans, fibroepithelial polyp, epithelial cyst, keratoacanthoma, and adnexal (appendage) tumors; premalignant and malignant epidermal tumors, including but not limited to, actinic keratosis, squamous cell carcinoma, basal cell carcinoma, and Merkel cell carcinoma; tumors of the dermis, including but not limited to, benign fibrous histiocytoma, dermatofibrosarcoma protuberans, xanthomas, and dermal vascular tumors; tumors of cellular immigrants to the skin, including but not limited to, histiocytosis X, mycosis fungoides (cutaneous T-cell lymphoma), and mastocytosis; disorders of epidermal maturation, including but not limited to, ichthyosis; acute inflammatory dermatoses, including but not limited to, urticaria, acute eczematous dermatitis, and erythema multiforme; chronic inflammatory dermatoses, including but not limited to, psoriasis, lichen planus, and lupus erythematosus; blistering (bullous) diseases, including but not limited to, pemphigus, bullous pemphigoid, dermatitis herpetiformis, and noninflammatory blistering diseases: epidermolysis bullosa and porphyria; disorders of epidermal appendages, including but not limited to, acne vulgaris; panniculitis, including but not limited to, erythema nodosum and erythema induratum; and infection and infestation, such as verrucae, molluscum contagiosum, impetigo, superficial fungal infections, and arthropod bites, stings, and infestations.

In normal bone marrow, the myelocytic series (polymorphonuclear cells) make up approximately 60% of the cellular elements, and the erythrocytic series, 20-30%. Lymphocytes, monocytes, reticular cells, plasma cells and megakaryocytes together constitute 10-20%. Lymphocytes make up 5-15% of normal adult marrow. In the bone marrow, cell types are admixed so that precursors of red blood cells (erythroblasts), macrophages (monoblasts), platelets (megakaryocytes), polymorphonuclear leukocytes (myeloblasts), and lymphocytes (lymphoblasts) can be visible in one microscopic field. In addition, stem cells exist for the different cell lineages, as well as a precursor stem cell for the committed progenitor cells of the different lineages. The various types of cells and stages of each would be known to the person of ordinary skill in the art and are found, for example, on page 42 (FIG. 2-8) of Immunology, Immunopathology and Immunity, Fifth Edition, Sell et al. Simon and Schuster (1996), incorporated by reference for its teaching of cell types found in the bone marrow. Accordingly, the invention is directed to disorders arising from these cells.
These disorders include but are not limited to the following: diseases involving hematopoietic stem cells; committed lymphoid progenitor cells; lymphoid cells including B and T-cells; committed myeloid progenitors, including monocytes, granulocytes, and megakaryocytes; and committed erythroid progenitors. These include but are not limited to the leukemias, including B-lymphoid leukemias, T-lymphoid leukemias, undifferentiated leukemias; erythroleukemia, megakaryoblastic leukemia, monocytic; [leukemias are encompassed with and without differentiation]; chronic and acute lymphoblastic leukemia, chronic and acute lymphocytic leukemia, chronic and acute myelogenous leukemia, lymphoma, myelodysplastic syndrome, chronic and acute myeloid leukemia, myelomonocytic leukemia; chronic and acute myeloblastic leukemia, chronic and acute myelogenous leukemia, chronic and acute promyelocytic leukemia, chronic and acute myelocytic leukemia, hematologic malignancies of monocyte-macrophage lineage, such as juvenile chronic myelogenous leukemia; secondary AML, antecedent hematological disorder; refractory anemia; aplastic anemia; reactive cutaneous angioendotheliomatosis; fibrosing disorders involving altered expression in dendritic cells, disorders including systemic sclerosis, E-M syndrome, epidemic toxic oil syndrome, eosinophilic fasciitis, localized forms of scleroderma, keloid, and fibrosing colonopathy; angiomatoid malignant fibrous histiocytoma; carcinoma, including primary head and neck squamous cell carcinoma; sarcoma, including Kaposi's sarcoma; fibroadenoma and phyllodes tumors, including mammary fibroadenoma; stromal tumors; phyllodes tumors, including histiocytoma; erythroblastosis; neurofibromatosis; diseases of the vascular endothelium; demyelinating, particularly in old lesions; gliosis, vasogenic edema, vascular disease, Alzheimer's and Parkinson's disease; T-cell lymphomas; B-cell lymphomas.

Disorders involving the spleen include, but are not limited to, splenomegaly, including nonspecific acute splenitis, congestive splenomegaly, and splenic infarcts; neoplasms, congenital anomalies, and rupture. Disorders associated with splenomegaly include infections, such as nonspecific splenitis, infectious mononucleosis, tuberculosis, typhoid fever, brucellosis, cytomegalovirus, syphilis, malaria, histoplasmosis, toxoplasmosis, kala-azar, trypanosomiasis, schistosomiasis, leishmaniasis, and echinococcosis; congestive states related to portal hypertension, such as cirrhosis of the liver, portal or splenic vein thrombosis, and cardiac failure; lymphohematogenous disorders, such as Hodgkin disease, non-Hodgkin lymphomas/leukemia, multiple myeloma, myeloproliferative disorders, hemolytic anemias, and thrombocytopenic purpura; immunologic-inflammatory conditions, such as rheumatoid arthritis and systemic lupus erythematosus; storage diseases such as Gaucher disease, Niemann-Pick disease, and mucopolysaccharidoses; and other conditions, such as amyloidosis, primary neoplasms and cysts, and secondary neoplasms.

Disorders involving blood vessels include, but are not limited to, responses of vascular cell walls to injury, such as endothelial dysfunction and endothelial activation and intimal thickening; vascular diseases including, but not limited to, congenital anomalies, such as arteriovenous fistula, atherosclerosis, and hypertensive vascular disease, such as hypertension; inflammatory disease (the vasculitides), such as giant cell (temporal) arteritis, Takayasu arteritis, polyarteritis nodosa (classic), Kawasaki syndrome (mucocutaneous lymph node syndrome), microscopic polyangiitis (microscopic polyarteritis, hypersensitivity or leukocytoclastic angiitis), Wegener granulomatosis, thromboangiitis obliterans (Buerger disease), vasculitis associated with other disorders, and infectious arteritis; Raynaud disease; aneurysms and dissection, such as abdominal aortic aneurysms, syphilitic (luetic) aneurysms, and aortic dissection (dissecting hematoma); disorders of veins and lymphatics, such as varicose veins, thrombophlebitis and phlebothrombosis, obstruction of superior vena cava (superior vena cava syndrome), obstruction of inferior vena cava (inferior vena cava syndrome), and lymphangitis and lymphedema; tumors, including benign tumors and tumor-like conditions, such as hemangioma, lymphangioma, glomus tumor (glomangioma), vascular ectasias, and bacillary angiomatosis, and intermediate-grade (borderline low-grade malignant) tumors, such as Kaposi sarcoma and hemangioendothelioma, and malignant tumors, such as angiosarcoma and hemangiopericytoma; and pathology of therapeutic interventions in vascular disease, such as balloon angioplasty and related techniques and vascular replacement, such as coronary artery bypass graft surgery.

Disorders involving red cells include, but are not limited to, anemias, such as hemolytic anemias, including hereditary spherocytosis, hemolytic disease due to erythrocyte enzyme defects: glucose-6-phosphate dehydrogenase deficiency, sickle cell disease, thalassemia syndromes, paroxysmal nocturnal hemoglobinuria, immunohemolytic anemia, and hemolytic anemia resulting from trauma to red cells; and anemias of diminished erythropoiesis, including megaloblastic anemias, such as anemias of vitamin B12 deficiency: pernicious anemia, and anemia of folate deficiency, iron deficiency anemia, anemia of chronic disease, aplastic anemia, pure red cell aplasia, and other forms of marrow failure.

Disorders involving B-cells include, but are not limited to, precursor B-cell neoplasms, such as lymphoblastic leukemia/lymphoma. Peripheral B-cell neoplasms include, but are not limited to, chronic lymphocytic leukemia/small lymphocytic lymphoma, follicular lymphoma, diffuse large B-cell lymphoma, Burkitt lymphoma, plasma cell neoplasms, multiple myeloma, and related entities, lymphoplasmacytic lymphoma (Waldenström macroglobulinemia), mantle cell lymphoma, marginal zone lymphoma (MALToma), and hairy cell leukemia.

Disorders related to reduced platelet number, thrombocytopenia, include idiopathic thrombocytopenic purpura, including acute idiopathic thrombocytopenic purpura, drug-induced thrombocytopenia, HIV-associated thrombocytopenia, and thrombotic microangiopathies: thrombotic thrombocytopenic purpura and hemolytic-uremic syndrome.

Disorders involving precursor T-cell neoplasms include precursor T lymphoblastic leukemia/lymphoma. Disorders involving peripheral T-cell and natural killer cell neoplasms include T-cell chronic lymphocytic leukemia, large granular lymphocytic leukemia, mycosis fungoides and Sezary syndrome, peripheral T-cell lymphoma, unspecified, angioimmunoblastic T-cell lymphoma, angiocentric lymphoma (NK/T-cell lymphoma), intestinal T-cell lymphoma, adult T-cell leukemia/lymphoma, and anaplastic large cell lymphoma.

Bone-forming cells include the osteoprogenitor cells, osteoblasts, and osteocytes. The disorders of the bone are complex because they may have an impact on the skeleton during any of its stages of development. Hence, the disorders may have variable manifestations and may involve one, multiple or all bones of the body. Such disorders include congenital malformations, achondroplasia and thanatophoric dwarfism, diseases associated with abnormal matrix such as type 1 collagen disease, osteoporosis, Paget disease, rickets, osteomalacia, high-turnover osteodystrophy, low-turnover or aplastic disease, osteonecrosis, pyogenic osteomyelitis, tuberculous osteomyelitis, osteoma, osteoid osteoma, osteoblastoma, osteosarcoma, osteochondroma, chondromas, chondroblastoma, chondromyxoid fibroma, chondrosarcoma, fibrous cortical defects, fibrous dysplasia, fibrosarcoma, malignant fibrous histiocytoma, Ewing sarcoma, primitive neuroectodermal tumor, giant cell tumor, and metastatic tumors.

Disorders involving the tonsils include, but are not limited to, tonsillitis, peritonsillar abscess, squamous cell carcinoma, dyspnea, hyperplasia, follicular hyperplasia, reactive lymphoid hyperplasia, non-Hodgkin's lymphoma and B-cell lymphoma.

Disorders involving the liver include, but are not limited to, hepatic injury; jaundice and cholestasis, such as bilirubin and bile formation; hepatic failure and cirrhosis, such as cirrhosis, portal hypertension, including ascites, portosystemic shunts, and splenomegaly; infectious disorders, such as viral hepatitis, including hepatitis A-E infection and infection by other hepatitis viruses, clinicopathologic syndromes, such as the carrier state, asymptomatic infection, acute viral hepatitis, chronic viral hepatitis, and fulminant hepatitis; autoimmune hepatitis; drug- and toxin-induced liver disease, such as alcoholic liver disease; inborn errors of metabolism and pediatric liver disease, such as hemochromatosis, Wilson disease, α₁-antitrypsin deficiency, and neonatal hepatitis; intrahepatic biliary tract disease, such as secondary biliary cirrhosis, primary biliary cirrhosis, primary sclerosing cholangitis, and anomalies of the biliary tree; circulatory disorders, such as impaired blood flow into the liver, including hepatic artery compromise and portal vein obstruction and thrombosis, impaired blood flow through the liver, including passive congestion and centrilobular necrosis and peliosis hepatis, hepatic vein outflow obstruction, including hepatic vein thrombosis (Budd-Chiari syndrome) and veno-occlusive disease; hepatic disease associated with pregnancy, such as preeclampsia and eclampsia, acute fatty liver of pregnancy, and intrahepatic cholestasis of pregnancy; hepatic complications of organ or bone marrow transplantation, such as drug toxicity after bone marrow transplantation, graft-versus-host disease and liver rejection, and nonimmunologic damage to liver allografts; tumors and tumorous conditions, such as nodular hyperplasias, adenomas, and malignant tumors, including primary carcinoma of the liver and metastatic tumors.

Disorders involving the colon include, but are not limited to, congenital anomalies, such as atresia and stenosis, Meckel diverticulum, congenital aganglionic megacolon-Hirschsprung disease; enterocolitis, such as diarrhea and dysentery, infectious enterocolitis, including viral gastroenteritis, bacterial enterocolitis, necrotizing enterocolitis, antibiotic-associated colitis (pseudomembranous colitis), and collagenous and lymphocytic colitis, miscellaneous intestinal inflammatory disorders, including parasites and protozoa, acquired immunodeficiency syndrome, transplantation, drug-induced intestinal injury, radiation enterocolitis, neutropenic colitis (typhlitis), and diversion colitis; idiopathic inflammatory bowel disease, such as Crohn disease and ulcerative colitis; tumors of the colon, such as non-neoplastic polyps, adenomas, familial syndromes, colorectal carcinogenesis, colorectal carcinoma, and carcinoid tumors.

Disorders involving the lung include, but are not limited to, congenital anomalies; atelectasis; diseases of vascular origin, such as pulmonary congestion and edema, including hemodynamic pulmonary edema and edema caused by microvascular injury, adult respiratory distress syndrome (diffuse alveolar damage), pulmonary embolism, hemorrhage, and infarction, and pulmonary hypertension and vascular sclerosis; chronic obstructive pulmonary disease, such as emphysema, chronic bronchitis, bronchial asthma, and bronchiectasis; diffuse interstitial (infiltrative, restrictive) diseases, such as pneumoconioses, sarcoidosis, idiopathic pulmonary fibrosis, desquamative interstitial pneumonitis, hypersensitivity pneumonitis, pulmonary eosinophilia (pulmonary infiltration with eosinophilia), bronchiolitis obliterans-organizing pneumonia, diffuse pulmonary hemorrhage syndromes, including Goodpasture syndrome, idiopathic pulmonary hemosiderosis and other hemorrhagic syndromes, pulmonary involvement in collagen vascular disorders, and pulmonary alveolar proteinosis; complications of therapies, such as drug-induced lung disease, radiation-induced lung disease, and lung transplantation; tumors, such as bronchogenic carcinoma, including paraneoplastic syndromes, bronchioloalveolar carcinoma, neuroendocrine tumors, such as bronchial carcinoid, miscellaneous tumors, and metastatic tumors; pathologies of the pleura, including inflammatory pleural effusions, noninflammatory pleural effusions, pneumothorax, and pleural tumors, including solitary fibrous tumors (pleural fibroma) and malignant mesothelioma.

The epitope-bearing polypeptides may be produced by any conventional means (Houghten, R. A., Proc. Natl. Acad. Sci. USA 82:5131-5135 (1985)). Simultaneous multiple peptide synthesis is described in U.S. Pat. No. 4,631,211.

Fragments can be discrete (not fused to other amino acids or polypeptides) or can be within a larger polypeptide. Further, several fragments can be comprised within a single larger polypeptide. In one embodiment, a fragment designed for expression in a host can have heterologous pre- and pro-polypeptide regions fused to the amino terminus of the polypeptide fragment and an additional region fused to the carboxyl terminus of the fragment.

The invention thus provides chimeric or fusion proteins. These comprise a protein of the invention operatively linked to a heterologous protein having an amino acid sequence not substantially homologous to the protein of the invention. In the case where an expression cassette contains two protein coding regions joined in a contiguous manner in the same reading frame, the encoded polypeptide is herein defined as a “heterologous polypeptide” or a “chimeric polypeptide” or a “fusion polypeptide”. As used herein, a GAP “heterologous protein” or “chimeric protein” or “fusion protein” comprises a GAP polypeptide operably linked to a non-GAP polypeptide. The heterologous protein can be fused to the N-terminus or C-terminus of the protein of the invention. “Operatively linked” indicates that the protein of the invention and the heterologous protein are fused in-frame.

In one embodiment, the fusion protein does not affect GAP function per se. For example, the fusion protein can be a GST-fusion protein in which the sequences of the invention are fused to the N- or C-terminus of the GST sequences. Other types of fusion proteins include, but are not limited to, enzymatic fusion proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL-4 fusions, poly-His fusions and Ig fusions. Such fusion proteins, particularly poly-His fusions, can facilitate the purification of a recombinant protein of the invention. In certain host cells (e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a heterologous signal sequence. Therefore, in another embodiment, the fusion protein contains a heterologous signal sequence at its C- or N-terminus.

EP-A-0 464 533 discloses fusion proteins comprising various portions of immunoglobulin constant regions. The Fc is useful in therapy and diagnosis and thus results, for example, in improved pharmacokinetic properties (EP-A-0 232 262). In drug discovery, for example, human proteins have been fused with Fc portions for the purpose of high-throughput screening assays to identify antagonists (Bennett et al., J. Mol. Recog. 8:52-58 (1995); Johanson et al., J. Biol. Chem. 270, 16:9459-9471 (1995)). Thus, this invention also encompasses soluble fusion proteins containing a polypeptide of the invention and various portions of the constant regions of heavy or light chains of immunoglobulins of various subclasses (IgG, IgM, IgA, IgE). The preferred immunoglobulin portion is the constant part of the heavy chain of human IgG, particularly IgG1, where fusion takes place at the hinge region. For some uses it is desirable to remove the Fc after the fusion protein has been used for its intended purpose, for example when the fusion protein is to be used as antigen for immunizations. In a particular embodiment, the Fc part can be removed in a simple way by a cleavage sequence which is also incorporated and can be cleaved with factor Xa.

A chimeric or fusion protein can be produced by standard recombinant DNA techniques. For example, DNA fragments coding for the different protein sequences are ligated together in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR amplification of gene fragments can be carried out using anchor primers which give rise to complementary overhangs between two consecutive gene fragments which can subsequently be annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al., Current Protocols in Molecular Biology, 1992). Moreover, many expression vectors are commercially available that already encode a fusion moiety (e.g., a GST protein). A GAP-encoding nucleic acid of the invention can be cloned into such an expression vector such that the fusion moiety is linked in-frame to the GAP.
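The in-frame requirement described above can be illustrated with a minimal sketch. It assumes plain DNA strings for the two coding regions; the function names and the toy sequences are hypothetical and are not sequences of the invention. The sketch only checks that the downstream partner stays in the reading frame of the upstream partner and that no stop codon interrupts the fusion before its final codon.

```python
# Illustrative sketch only: function names and toy sequences are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a DNA string into consecutive 3-base codons, dropping any partial tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def is_in_frame_fusion(upstream_cds, downstream_cds, linker=""):
    """Return True if upstream + linker + downstream reads as one open reading frame.

    Assumption: the upstream coding sequence is supplied without its stop codon.
    """
    if len(upstream_cds + linker) % 3 != 0:
        return False  # the downstream partner would be shifted out of frame
    fused = (upstream_cds + linker + downstream_cds).upper()
    if len(fused) % 3 != 0:
        return False  # the fusion ends in a partial codon
    # Only the final codon of the fusion may be a stop codon.
    return not any(c in STOP_CODONS for c in codons(fused)[:-1])


# Toy example: a 9-base "fusion moiety" joined to a 12-base "insert".
fusion_moiety = "ATGTCCCCT"   # ATG TCC CCT
insert = "GGATCCAAATGA"       # GGA TCC AAA TGA
print(is_in_frame_fusion(fusion_moiety, insert))              # in frame
print(is_in_frame_fusion(fusion_moiety, insert, linker="G"))  # frame-shifted
```

A single extra base in the linker is enough to shift the downstream partner out of frame, which is why the junction length is checked separately from the stop-codon scan.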

Another form of fusion protein is one that directly affects the GAP functions. Accordingly, a polypeptide is encompassed by the present invention in which one or more of the domains (or parts thereof) has been replaced by homologous domains (or parts thereof) from another GAP. Various permutations are possible. Thus, chimeric proteins can be formed in which one or more of the native domains, subregions, or motifs has been replaced. A form of fusion protein is that in which GAP activator or regulatory domains are derived from a different GAP family, including but not limited to those described in the background herein above, such as RabGAP.

The isolated protein of the invention can be purified from cells that naturally express it, including but not limited to, those described herein above, purified from cells that have been altered to express it (recombinant), or synthesized using known protein synthesis methods.

In one embodiment, the protein is produced by recombinant DNA techniques. For example, a nucleic acid molecule encoding a polypeptide of the invention is cloned into an expression vector, the expression vector introduced into a host cell and the protein expressed in the host cell. The protein can then be isolated from the cells by an appropriate purification scheme using standard protein purification techniques. Polypeptides often contain amino acids other than the 20 amino acids commonly referred to as the 20 naturally-occurring amino acids. Further, many amino acids, including the terminal amino acids, may be modified by natural processes, such as processing and other post-translational modifications, or by chemical modification techniques well known in the art. Common modifications that occur naturally in polypeptides are described in basic texts, detailed monographs, and the research literature, and they are well known to those of skill in the art.

Accordingly, the polypeptides also encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by the genetic code, in which a substituent group is included, in which the mature polypeptide is fused with another compound, such as a compound to increase the half-life of the polypeptide (for example, polyethylene glycol), or in which the additional amino acids are fused to the mature polypeptide, such as a leader or secretory sequence or a sequence for purification of the mature polypeptide or a pro-protein sequence.

Known modifications include, but are not limited to, acetylation, acylation, ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination.

Such modifications are well known to those of skill in the art and have been described in great detail in the scientific literature. Several particularly common modifications, glycosylation, lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and ADP-ribosylation, for instance, are described in most basic texts, such as Proteins: Structure and Molecular Properties, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993). Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (Meth. Enzymol. 182:626-646 (1990)) and Rattan et al. (Ann. N.Y. Acad. Sci. 663:48-62 (1992)).

As is also well known, polypeptides are not always entirely linear. For instance, polypeptides may be branched as a result of ubiquitination, and they may be circular, with or without branching, generally as a result of post-translational events, including natural processing events and events brought about by human manipulation which do not occur naturally. Circular, branched and branched circular polypeptides may be synthesized by non-translational natural processes and by synthetic methods.

Modifications can occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. Blockage of the amino or carboxyl group in a polypeptide, or both, by a covalent modification, is common in naturally-occurring and synthetic polypeptides. For instance, the amino terminal residue of polypeptides made in E. coli, prior to proteolytic processing, almost invariably will be N-formylmethionine.

The modifications can be a function of how the protein is made. For recombinant polypeptides, for example, the modifications will be determined by the host cell posttranslational modification capacity and the modification signals in the polypeptide amino acid sequence. Accordingly, when glycosylation is desired, a polypeptide should be expressed in a glycosylating host, generally a eukaryotic cell. Insect cells often carry out the same posttranslational glycosylations as mammalian cells and, for this reason, insect cell expression systems have been developed to efficiently express mammalian proteins having native patterns of glycosylation. Similar considerations apply to other modifications.

The same type of modification may be present in the same or varying degree at several sites in a given polypeptide. Also, a given polypeptide may contain more than one type of modification.

Polypeptide Uses

The polypeptides of the invention are useful for producing antibodies specific for the protein, regions, or fragments. Regions having a high antigenicity index score are shown in FIGS. 54 and 58.

The polypeptides (including variants and fragments which may have been disclosed prior to the present invention) are useful for biological assays related to GAPs. Such assays involve any of the known GAP functions or activities such as those described herein, such functions or activities or properties being useful for diagnosis and treatment of GAP-related conditions. Treatment is defined as the application or administration of a therapeutic agent to a patient, or application or administration of a therapeutic agent to an isolated tissue or cell line from a patient, who has a disease, a symptom of disease or a predisposition toward a disease, with the purpose to cure, heal, alleviate, relieve, alter, remedy, ameliorate, improve or affect the disease, the symptoms of disease or the predisposition toward disease. “Subject”, as used herein, can refer to a mammal, e.g., a human, or to an experimental animal or disease model. The subject can also be a non-human animal, e.g., a horse, cow, goat, or other domestic animal. A therapeutic agent includes, but is not limited to, small molecules, peptides, antibodies, ribozymes and antisense oligonucleotides.

The polypeptides of the invention are also useful in drug screening assays, in cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express the protein, as a biopsy or expanded in cell culture. For the various biological assays described herein, these cells include, but are not limited to, those disclosed above. In one embodiment, however, cell-based assays involve recombinant host cells expressing the protein.

Determining the ability of the test compound to interact with the polypeptide can also comprise determining the ability of the test compound to preferentially bind to the polypeptide as compared to the ability of the substrate (i.e., GTPase) or effector (i.e., regulatory molecule), or a biologically active portion thereof, to bind to the polypeptide.

The polypeptides can be used to identify compounds that modulate polypeptide activity, e.g., GAP activity. Such compounds, for example, can increase or decrease affinity or rate of binding to a known substrate or effector, compete with substrate or effector for binding, or displace bound substrate or effector. Both a protein of the invention and appropriate variants and fragments can be used in high-throughput screens to assay candidate compounds for the ability to bind to a protein of the invention (i.e., 26651 GAP or 26138 GAP). These compounds can be further screened against a functional polypeptide of the invention to determine the effect of the compound on the protein activity. Compounds can be identified that activate (agonist) or inactivate (antagonist) the protein to a desired degree. Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject).

The polypeptides can be used to screen a compound for the ability to stimulate or inhibit interaction between the protein and a target molecule that normally interacts with the GAP. The target can be a GTPase, regulatory protein, or other regulatory molecule or a component of the signal pathway with which the GAP normally interacts. The assay includes the steps of combining the protein of the invention with a candidate compound under conditions that allow the protein or fragment to interact with the target molecule, and detecting the formation of a complex between the protein and the target or detecting the biochemical consequence of the interaction of the protein with the target. When a protein of the invention is involved in a specific signal pathway, the biological consequence can include any of the associated effects of signal transduction such as G-protein phosphorylation, cyclic AMP or phosphatidylinositol turnover, and adenylate cyclase or phospholipase C activation, or any of the associated effects of GTPase activity including, but not limited to, programmed cell death (apoptosis), membrane trafficking, organization of the actin cytoskeleton, activation of protein kinases activated by direct interaction with GTPases, and in particular, with Rho and Ras, membrane ruffling, formation of actin stress fibers, or generalized cellular effects such as transformation, and effects on growth and differentiation.

Determining the ability of the protein to bind to a target molecule can also be accomplished using a technology such as real-time Bimolecular Interaction Analysis (BIA) (see Sjolander, S. and Urbaniczky, C. (1991) Anal. Chem. 63:2338-2345 and Szabo et al. (1995) Curr. Opin. Struct. Biol. 5:699-705). As used herein, “BIA” is a technology for studying biospecific interactions in real time, without labeling any of the interactants (e.g., BIAcore™). Changes in the optical phenomenon surface plasmon resonance (SPR) can be used as an indication of real-time reactions between biological molecules.

The test compounds of the present invention can be obtained using any of the numerous approaches in combinatorial library methods known in the art, including: biological libraries; spatially addressable parallel solid phase or solution phase libraries; synthetic library methods requiring deconvolution; the ‘one-bead one-compound’ library method; and synthetic library methods using affinity chromatography selection. The biological library approach is limited to polypeptide libraries, while the other four approaches are applicable to polypeptide, non-peptide oligomer or small molecule libraries of compounds (Lam, K. S. (1997) Anticancer Drug Des. 12:145).

Examples of methods for the synthesis of molecular libraries can be found in the art, for example in DeWitt et al. (1993) Proc. Natl. Acad. Sci. USA 90:6909; Erb et al. (1994) Proc. Natl. Acad. Sci. USA 91:11422; Zuckermann et al. (1994) J. Med. Chem. 37:2678; Cho et al. (1993) Science 261:1303; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2059; Carell et al. (1994) Angew. Chem. Int. Ed. Engl. 33:2061; and in Gallop et al. (1994) J. Med. Chem. 37:1233. Libraries of compounds may be presented in solution (e.g., Houghten (1992) Biotechniques 13:412-421), or on beads (Lam (1991) Nature 354:82-84), chips (Fodor (1993) Nature 364:555-556), bacteria (Ladner U.S. Pat. No. 5,223,409), spores (Ladner U.S. Pat. No. 5,223,409), plasmids (Cull et al. (1992) Proc. Natl. Acad. Sci. USA 89:1865-1869) or on phage (Scott and Smith (1990) Science 249:386-390; Devlin (1990) Science 249:404-406; Cwirla et al. (1990) Proc. Natl. Acad. Sci. USA 87:6378-6382; Felici (1991) J. Mol. Biol. 222:301-310; Ladner supra).

Candidate compounds include, for example, 1) peptides such as soluble peptides, including Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al., Nature 354:82-84 (1991); Houghten et al., Nature 354:84-86 (1991)) and combinatorial chemistry-derived molecular libraries made of D- and/or L-configuration amino acids; 2) phosphopeptides (e.g., members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang et al., Cell 72:767-778 (1993)); 3) antibodies (e.g., polyclonal, monoclonal, humanized, anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab′)₂, Fab expression library fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic molecules (e.g., molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble full-length protein of the invention or fragment that competes for substrate or effector binding. Other candidate compounds include mutant proteins of the invention or appropriate fragments containing mutations that affect protein function and thus compete for substrate or effector. Accordingly, a fragment that competes for substrate or effector, for example with a higher affinity, or a fragment that binds but does not allow release, is encompassed by the invention. A candidate compound includes, but is not limited to, a GTPase analog that competes for native GTPase binding.

The invention provides other end points to identify compounds that modulate (stimulate or inhibit) protein activity. When the function of a protein of the invention is related to a signal transduction pathway, the assays typically involve an assay of events in the signal transduction pathway that indicate GAP activity. Thus, the expression of genes that are up- or down-regulated in response to the receptor protein dependent signal cascade can be assayed. For GAP function, assays typically involve an assay of events in the pathway, for example, GTPase activation or inhibition, GTP or GDP binding to a GTPase, and end points such as membrane ruffling and effects on cytoskeletal organization, actin organization, and the like. In one embodiment, the regulatory region of such genes can be operably linked to a marker that is easily detectable, such as luciferase. Alternatively, phosphorylation of a protein of the invention, or a G-protein target, could also be measured.

Any of the biological or biochemical functions mediated by a protein of the invention can be used as an endpoint assay. These include all of the biochemical or biochemical/biological events described herein, in the references cited herein, incorporated by reference for these endpoint assay targets, and other functions known to those of ordinary skill in the art.

Binding and/or activating compounds can also be screened by using chimeric proteins of the invention in which regions or domains, such as the GTPase binding regions, catalytic (i.e., activation or inhibition) regions, regions interacting with regulatory proteins of GAP, or parts thereof, can be replaced by heterologous domains or regions. Activation can also be detected by a reporter gene containing an easily detectable coding region operably linked to a transcriptional regulatory sequence that is part of a signal transduction pathway in which a GAP of the invention is involved.

The polypeptides of the invention are also useful in competition binding assays in methods designed to discover compounds that interact with the polypeptide. Thus, a compound is exposed to the polypeptide under conditions that allow the compound to bind or to otherwise interact with the polypeptide. Soluble polypeptide of the invention is also added to the mixture. If the test compound interacts with the soluble polypeptide, it decreases the amount of complex formed or activity from the target. This type of assay is particularly useful in cases in which compounds are sought that interact with specific regions of the polypeptide. Thus, the soluble polypeptide that competes with the target region is designed to contain peptide sequences corresponding to the region of interest.
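The decrease in complex formation described above is commonly summarized as a percent inhibition relative to a control reaction run without the soluble competitor. The following is a minimal sketch of that arithmetic only; the function name, the background-subtraction step, and the example readings are assumptions made for illustration and are not data from this disclosure.

```python
# Illustrative sketch only: names and example readings are hypothetical.
def percent_inhibition(signal_with_competitor, signal_without_competitor, background=0.0):
    """Percent decrease in complex signal caused by adding the soluble competitor.

    All three readings must be in the same units (e.g., counts per minute).
    The background reading is subtracted from both signals before comparison.
    """
    specific_control = signal_without_competitor - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = signal_with_competitor - background
    return 100.0 * (1.0 - specific_test / specific_control)


# Hypothetical readings: control 1050, with competitor 550, background 50.
print(percent_inhibition(550.0, 1050.0, background=50.0))
```

A value near zero indicates that the soluble polypeptide did not compete, while a value approaching 100 indicates that nearly all of the specific signal was competed away.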

To perform cell-free drug screening assays, it is desirable to immobilize either the protein, or fragment, or its target molecule to facilitate separation of complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening assays. In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to be bound to a matrix. For example, glutathione-S-transferase/GAP fusion proteins can be adsorbed onto glutathione sepharose beads (Sigma Chemical, St. Louis, Mo.) or glutathione derivatized microtitre plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate compound, and the mixture incubated under conditions conducive to complex formation (e.g., at physiological conditions for salt and pH). Following incubation, the beads are washed to remove any unbound label, and the matrix immobilized and radiolabel determined directly, or in the supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated from the matrix, separated by SDS-PAGE, and the level of GAP-binding protein found in the bead fraction quantified from the gel using standard electrophoretic techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing conjugation of biotin and streptavidin using techniques well known in the art. Alternatively, antibodies reactive with the protein but which do not interfere with binding of the protein to its target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by antibody conjugation. Preparations of a GAP-binding protein and a candidate compound are incubated in the GAP-presenting wells and the amount of complex trapped in the well can be quantified. Methods for detecting such complexes, in addition to those described above for the GST-immobilized complexes, include immunodetection of complexes using antibodies reactive with the GAP target molecule, or which are reactive with GAP and compete with the target molecule; as well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the target molecule.

Modulators of GAP activity identified according to these drug screening assays can be used to treat a subject with a disorder mediated by a protein of the invention, by treating cells that express a protein of the invention, such as those disclosed herein.

These methods of treatment include the steps of administering the modulators of protein activity in a pharmaceutical composition as described herein, to a subject in need of such treatment.

The polypeptides of the invention are thus useful for treating a GAP-associated disorder characterized by aberrant expression or activity of a GAP. In one embodiment, the method involves administering an agent (e.g., an agent identified by a screening assay described herein), or combination of agents that modulates (e.g., upregulates or downregulates) expression or activity of the protein. In another embodiment, the method involves administering a protein as therapy to compensate for reduced or aberrant expression or activity of the protein.

Stimulation of protein activity is desirable in situations in which the protein is abnormally downregulated and/or in which increased protein activity is likely to have a beneficial effect. Likewise, inhibition of protein activity is desirable in situations in which the protein is abnormally upregulated and/or in which decreased protein activity is likely to have a beneficial effect. An example of such a situation occurs when the GAP is inactivating a protein and inhibition of the GAP allows activation of the protein (Chen et al. (1998) Neuron 20:895-904). In one example of such a situation, a subject has a disorder characterized by aberrant development or cellular differentiation. In another example of such a situation, the subject has a proliferative disease (e.g., cancer) or a disorder characterized by an aberrant hematopoietic response. In another example of such a situation, it is desirable to achieve tissue regeneration in a subject (e.g., where a subject has undergone brain or spinal cord injury and it is desirable to regenerate neuronal tissue in a regulated manner).

In yet another aspect of the invention, the proteins of the invention can be used as “bait proteins” in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Pat. No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO 94/10300), to identify other proteins (captured proteins) which bind to or interact with the proteins of the invention and modulate their activity.

The polypeptides of the invention also are useful to provide a target for diagnosing a disease or predisposition to disease mediated by a GAP, especially in diseases involving the tissues in which a protein of the invention is expressed such as are disclosed herein. Accordingly, methods are provided for detecting the presence, or levels of, a protein of the invention in a cell, tissue, or organism. The method involves contacting a biological sample with a compound capable of interacting with the protein such that the interaction can be detected.

One agent for detecting the protein is an antibody capable of selectively binding to the protein. A biological sample includes tissues, cells and biological fluids isolated from a subject, as well as tissues, cells and fluids present within a subject.

The protein of the invention also provides a target for diagnosing active disease, or predisposition to disease, in a patient having a variant protein of the invention. Thus, the protein can be isolated from a biological sample and assayed for the presence of a genetic mutation that results in an aberrant protein. This includes amino acid substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and inappropriate post-translational modification. Analytic methods include altered electrophoretic mobility, altered tryptic peptide digest, altered GAP activity in cell-based or cell-free assays, altered antibody-binding pattern, altered isoelectric point, direct amino acid sequencing, and any other of the known assay techniques useful for detecting mutations in a protein.

In vitro techniques for detection of protein of the invention include enzyme linked immunosorbent assays (ELISAs), Western blots, immunoprecipitations and immunofluorescence. Alternatively, the protein can be detected in vivo in a subject by introducing into the subject a labeled anti-GAP antibody. For example, the antibody can be labeled with a radioactive marker whose presence and location in a subject can be detected by standard imaging techniques. Particularly useful are methods which detect the allelic variant of the protein expressed in a subject and methods which detect fragments of the protein in a sample.

The polypeptides are also useful in pharmacogenomic analysis. Pharmacogenomics deals with clinically significant hereditary variations in the response to drugs due to altered drug disposition and abnormal action in affected persons. See, e.g., Eichelbaum, M., Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985 (1996), and Linder, M. W., Clin. Chem. 43(2):254-266 (1997). The clinical outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism. Thus, the genotype of the individual can determine the way a therapeutic compound acts on the body or the way the body metabolizes the compound. Further, the activity of drug metabolizing enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the individual permit the selection of effective compounds and effective dosages of such compounds for prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may lead to allelic protein variants in which one or more functions in one population are different from those in another population. The polypeptides thus provide a target for ascertaining a genetic predisposition that can affect treatment modality. Thus, in a substrate or effector-based treatment, polymorphism may give rise to domains and/or other binding regions that are more or less active in binding and/or activation. Accordingly, dosage would necessarily be modified to maximize the therapeutic effect within a given population containing a polymorphism. As an alternative to genotyping, specific polymorphic polypeptides could be identified.

The polypeptides are also useful for monitoring therapeutic effects during clinical trials and other treatment. Thus, the therapeutic effectiveness of an agent that is designed to increase or decrease gene expression, protein levels or activity can be monitored over the course of treatment using the polypeptides as an end-point target. The monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression or activity of a specified protein in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the protein in the post-administration samples; (v) comparing the level of expression or activity of the protein in the pre-administration sample with the protein in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.
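By way of illustration only, the comparison in steps (v) and (vi) can be sketched in Python. The function name, the `target_level` and `tolerance` parameters, and the decision rule are hypothetical assumptions introduced for this sketch and are not taken from the specification; the levels are in arbitrary units.

```python
from statistics import mean

def adjust_administration(pre_level, post_levels, agent_effect,
                          target_level, tolerance=0.10):
    """Compare pre- and post-administration levels and suggest step (vi).

    agent_effect is "increase" or "decrease": the direction in which the
    agent is designed to move expression or activity (an assumption of
    this sketch). target_level and tolerance are hypothetical parameters.
    """
    if agent_effect not in ("increase", "decrease"):
        raise ValueError("agent_effect must be 'increase' or 'decrease'")
    if not post_levels:
        raise ValueError("at least one post-administration level is required")

    observed = mean(post_levels)              # step (iv), summarized
    low = target_level * (1 - tolerance)
    high = target_level * (1 + tolerance)

    if low <= observed <= high:
        decision = "maintain"
    elif (observed < low) == (agent_effect == "increase"):
        # The level has not yet moved far enough in the agent's direction.
        decision = "increase administration"
    else:
        # The level has overshot the target in the agent's direction.
        decision = "decrease administration"

    return {
        "pre": pre_level,
        "post_mean": observed,
        "change": observed - pre_level,       # step (v)
        "decision": decision,                 # step (vi)
    }
```

For example, with a pre-administration level of 10, post-administration levels of 12 and 14, an agent designed to increase expression, and a hypothetical target of 20, the sketch suggests increasing administration.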

The polypeptides are also useful for treating a GAP-associated disorder. Accordingly, methods for treatment include the use of soluble protein or fragments of the protein that compete for GTPase binding. These proteins or fragments can have a higher affinity for the GTPase so as to provide effective competition.

Antibodies

The invention also provides antibodies that selectively bind to a protein of the invention and its variants and fragments. An antibody is considered to selectively bind, even if it also binds to other proteins that are not substantially homologous with the protein. These other proteins share homology with a fragment or domain of the protein. This conservation in specific regions gives rise to antibodies that bind to both proteins by virtue of the homologous sequence. In this case, it would be understood that antibody binding to the protein is still selective.

To generate antibodies, an isolated polypeptide is used as an immunogen using standard techniques for polyclonal and monoclonal antibody preparation. Either the full-length protein or an antigenic peptide fragment can be used. Regions having a high antigenicity index are shown in FIGS. 54 and 58.

Antibodies are preferably prepared from these regions or from discrete fragments in these regions. However, antibodies can be prepared from any region of the peptide as described herein. A preferred fragment produces an antibody that diminishes or completely prevents GTPase binding. Antibodies can be developed against the entire protein or portions of the protein. Antibodies may also be developed against specific functional sites, such as the site of GTPase binding, or sites that are phosphorylated, myristoylated, or glycosylated.

An antigenic fragment will typically comprise at least 6 contiguous amino acid residues. The antigenic peptide can comprise a contiguous sequence of at least 12, at least 14, at least 15, at least 20, or at least 30 amino acid residues. In one embodiment, fragments correspond to regions that are located on the surface of the protein, e.g., hydrophilic regions. These fragments are not to be construed, however, as encompassing any fragments which may be disclosed prior to the invention.
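As a rough illustration of how hydrophilic windows of at least 6 contiguous residues might be located, the following Python sketch scans a sequence with the Kyte-Doolittle hydropathy scale. This is an assumption made for illustration: it is not the antigenicity index referred to in the figures, and the function name, window size, threshold, and example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values (more negative = more hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophilic_windows(protein, window=6, threshold=-1.0):
    """Return (1-based start, peptide, mean hydropathy) for each window
    whose mean hydropathy is at or below the threshold."""
    protein = protein.upper()
    hits = []
    for i in range(len(protein) - window + 1):
        peptide = protein[i:i + window]
        score = sum(KYTE_DOOLITTLE[aa] for aa in peptide) / window
        if score <= threshold:
            hits.append((i + 1, peptide, round(score, 2)))
    return hits

# Hypothetical sequence: a hydrophobic stretch followed by charged residues.
candidates = hydrophilic_windows("LLLLLLKKKKKKDDDDDD")
```

Windows selected this way are only candidates; surface exposure and antigenicity would still need to be confirmed by the methods described herein.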

Antibodies can be polyclonal or monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab′)₂) can be used.

Detection can be facilitated by coupling (i.e., physically linking) the antibody to a detectable substance. Examples of detectable substances include various enzymes, prosthetic groups, fluorescent materials, luminescent materials, bioluminescent materials, and radioactive materials. Examples of suitable enzymes include horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate, rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a luminescent material includes luminol; examples of bioluminescent materials include luciferase, luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

An appropriate immunogenic preparation can be derived from native or recombinantly expressed protein or from chemically synthesized peptides.

Antibody Uses

The antibodies can be used to isolate a protein by standard techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate the purification of the natural protein from cells and recombinantly produced protein expressed in host cells.

The antibodies are useful to detect the presence of the protein in cells or tissues to determine the pattern of expression among various tissues in an organism and over the course of normal development.

The antibodies can be used to detect the protein in situ, in vitro, or in a cell lysate or supernatant in order to evaluate the abundance and pattern of expression.

The antibodies can be used to assess abnormal tissue distribution or abnormal expression during development.

Antibody detection of circulating fragments of a full-length protein of the invention can be used to identify protein turnover.

Further, the antibodies can be used to assess the GAP expression in disease states such as in active stages of the disease or in an individual with a predisposition toward disease related to the GAP function. When a disorder is caused by an inappropriate tissue distribution, developmental expression, or level of expression of the protein, the antibody can be prepared against the normal protein. If a disorder is characterized by a specific mutation in the protein, antibodies specific for this mutant protein can be used to assay for the presence of the specific mutant protein. However, intracellularly-made antibodies (“intrabodies”) are also encompassed, which would recognize intracellular peptide regions.

The antibodies can also be used to assess normal and aberrant subcellular localization of the protein in cells in the various tissues in an organism. Antibodies can be developed against the whole protein or portions, such as those discussed herein.

The diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment modality. Accordingly, where treatment is ultimately aimed at correcting the expression level or the presence of an aberrant protein of the invention and aberrant tissue distribution or developmental expression, antibodies directed against the protein or relevant fragments can be used to monitor therapeutic efficacy. Antibodies accordingly can be used diagnostically to monitor protein levels in tissue as part of a clinical testing procedure, e.g., to determine the efficacy of a given treatment regimen.

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared against polymorphic proteins of the invention can be used to identify individuals that require modified treatment modalities.

The antibodies are also useful as diagnostic tools as an immunological marker for aberrant protein analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and other physical assays known to those in the art.

The antibodies are also useful for tissue typing. Thus, where a specific GAP of the invention has been correlated with expression in a specific tissue, antibodies that are specific for this protein can be used to identify a tissue type.

The antibodies are also useful in forensic identification. Accordingly, where an individual has been correlated with a specific genetic polymorphism resulting in a specific polymorphic protein, an antibody specific for the polymorphic protein can be used as an aid in identification.

The antibodies are also useful for inhibiting protein function, for example, blocking GTPase or regulatory molecule, e.g., protein, binding.

These uses can also be applied in a therapeutic context in which treatment involves inhibiting a function. An antibody can be used, for example, to block GTPase binding. Antibodies can be prepared against specific fragments containing sites required for function or against an intact protein of the invention associated with a cell.

Completely human antibodies are particularly desirable for therapeutic treatment of human patients. For an overview of this technology for producing human antibodies, see Lonberg and Huszar (1995, Int. Rev. Immunol. 13:65-93). For a detailed discussion of this technology for producing human antibodies and human monoclonal antibodies and protocols for producing such antibodies, see, e.g., U.S. Pat. No. 5,625,126; U.S. Pat. No. 5,633,425; U.S. Pat. No. 5,569,825; U.S. Pat. No. 5,661,016; and U.S. Pat. No. 5,545,806.

The invention also encompasses kits for using antibodies to detect the presence of a protein of the invention in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and a compound or agent for detecting the protein in a biological sample; means for determining the amount of the protein in the sample; and means for comparing the amount of the protein in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect the protein.

Polynucleotides

The nucleotide sequences in SEQ ID NO:21, 23, 24, and 26 were obtained by sequencing the deposited human full length cDNAs. Accordingly, the sequence of the deposited clone is controlling as to any discrepancies between the two and any reference to a sequence of SEQ ID NO:21, 23, 24, or 26 includes reference to a sequence of the deposited cDNA.

The specifically disclosed cDNAs comprise the coding region and 5′ and 3′ untranslated sequences (SEQ ID NO:21 or SEQ ID NO:24).

The invention provides isolated polynucleotides encoding a protein of the invention. The term “GAP polynucleotide,” “GAP nucleic acid,” “polynucleotide of the invention” or “nucleic acid of the invention” refers to a sequence shown in SEQ ID NO:21, 23, 24, 26, or in the deposited cDNA. The terms further include variants and fragments of a polynucleotide of the invention.

An “isolated” nucleic acid is one that is separated from other nucleic acid present in the natural source of the nucleic acid. Preferably, an “isolated” nucleic acid is free of sequences which naturally flank the nucleic acid (i.e., sequences located at the 5′ and 3′ ends of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived. However, there can be some flanking nucleotide sequences, for example up to about 5 KB. The important point is that the nucleic acid is isolated from flanking sequences such that it can be subjected to the specific manipulations described herein such as recombinant expression, preparation of probes and primers, and other uses specific to GAP nucleic acid sequences.

Moreover, an “isolated” nucleic acid molecule, such as a cDNA or RNA molecule, can be substantially free of other cellular material, or culture medium when produced by recombinant techniques, or chemical precursors or other chemicals when chemically synthesized. However, the nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered isolated.

For example, recombinant DNA molecules contained in a vector are considered isolated. Further examples of isolated DNA molecules include recombinant DNA molecules maintained in heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the present invention. Isolated nucleic acid molecules according to the present invention further include such molecules produced synthetically.

In some instances, the isolated material will form part of a composition (for example, a crude extract containing other substances), buffer system or reagent mix. In other circumstances, the material may be purified to essential homogeneity, for example as determined by PAGE or column chromatography such as HPLC. Preferably, an isolated nucleic acid comprises at least about 50, 80 or 90% (on a molar basis) of all macromolecular species present.

The GAP polynucleotides can encode the mature protein plus additional amino or carboxyl-terminal amino acids, or amino acids interior to the mature polypeptide (when the mature form has more than one polypeptide chain, for instance). Such sequences may play a role in processing of a protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein half-life or facilitate manipulation of a protein for assay or production, among other things. As generally is the case in situ, the additional amino acids may be processed away from the mature protein by cellular enzymes.

The GAP polynucleotides include, but are not limited to, the sequence encoding the mature polypeptide alone, the sequence encoding the mature polypeptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or pro-protein sequence), the sequence encoding the mature polypeptide, with or without the additional coding sequences, plus additional non-coding sequences, for example introns and non-coding 5′ and 3′ sequences such as transcribed but non-translated sequences that play a role in transcription, mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability of mRNA. In addition, the polynucleotide may be fused to a marker sequence encoding, for example, a peptide that facilitates purification.

Polynucleotides can be in the form of RNA, such as mRNA, or in the form of DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the non-coding strand (anti-sense strand).

One nucleic acid comprises a nucleotide sequence shown in SEQ ID NO:21, 23, 24, or 26, corresponding to human cDNA.

In one embodiment, the nucleic acid comprises the coding regions set forth in SEQ ID NO:23 or 26.

The invention further provides variant polynucleotides, and fragments thereof, that differ from a nucleotide sequence shown in SEQ ID NO:21, 23, 24, or 26 due to degeneracy of the genetic code and thus encode the same polypeptides as those set forth in SEQ ID NO:22 or 25.
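Degeneracy of the genetic code can be illustrated with a short Python sketch in which two different nucleotide sequences translate to the same peptide. The two sequences below are hypothetical examples and are not taken from SEQ ID NO:21, 23, 24, or 26; the translation table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
                "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

def translate(coding_dna):
    """Translate a coding-strand DNA sequence, stopping at a stop codon."""
    coding_dna = coding_dna.upper()
    peptide = []
    for i in range(0, len(coding_dna) - 2, 3):
        aa = CODON_TABLE[coding_dna[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)

# Hypothetical sequences differing at two third-codon positions.
variant_a = "ATGGCTAAA"   # ATG GCT AAA
variant_b = "ATGGCCAAG"   # ATG GCC AAG
same_peptide = translate(variant_a) == translate(variant_b)
```

Both hypothetical sequences encode Met-Ala-Lys even though they differ at two nucleotide positions.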

The invention also provides nucleic acid molecules encoding the variant polypeptides described herein. Generally, nucleotide sequence variants of the invention will have at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to the nucleotide sequences disclosed herein. Such polynucleotides may be naturally occurring, such as allelic variants (same locus), homologs (different locus), and orthologs (different organism), or may be constructed by recombinant DNA methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis techniques, including those applied to polynucleotides, cells, or organisms. Accordingly, as discussed above, the variants can contain nucleotide substitutions, deletions, inversions and insertions.
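A simplified way to compute percent identity between two sequences that have already been aligned is sketched below in Python. This is an illustrative assumption only: it counts identical aligned columns over the alignment length and does not reproduce any particular alignment program or scoring scheme, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must be pre-aligned to equal length, with "-" marking gaps.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)

def meets_threshold(aligned_a, aligned_b, threshold=60.0):
    """True if identity is at least the given percentage (e.g., 60 to 99)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With this sketch, two hypothetical ten-nucleotide sequences that differ at two aligned positions have 80% identity, which satisfies the 60% through 80% thresholds but not the 85% threshold.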

Variation can occur in either or both the coding and non-coding regions. The variations can produce both conservative and non-conservative amino acid substitutions.

Typically, variants have a substantial identity with a nucleic acid molecule of SEQ ID NO:21, 23, 24, or 26, or the complements thereof.

Orthologs, homologs, and allelic variants can be identified using methods well known in the art. These variants comprise a nucleotide sequence encoding a protein that is at least about 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to a nucleotide sequence shown in SEQ ID NO:21, 23, 24, 26, or a fragment of this sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under stringent conditions, to a nucleotide sequence shown in SEQ ID NO:21, 23, 24, 26, or a fragment of the sequence. It is understood that stringent hybridization does not indicate substantial homology where it is due to general homology, such as poly A sequences, or sequences common to all or most proteins, all or most GAPs, or all or most RasGAPs or RhoGAPs. Moreover, it is understood that variants do not include any of the nucleic acid sequences that may have been disclosed prior to the invention.

As used herein, the term “hybridizes under stringent conditions” describes conditions for hybridization and washing. Stringent conditions are known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. Aqueous and nonaqueous methods are described in that reference and either can be used. A preferred example of stringent hybridization conditions is hybridization in 6× sodium chloride/sodium citrate (SSC) at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 50° C. Another example of stringent hybridization conditions is hybridization in 6×SSC at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 55° C. A further example of stringent hybridization conditions is hybridization in 6×SSC at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 60° C. Preferably, stringent hybridization conditions are hybridization in 6×SSC at about 45° C., followed by one or more washes in 0.2×SSC, 0.1% SDS at 65° C. Particularly preferred stringency conditions (and the conditions that should be used if the practitioner is uncertain about what conditions should be applied to determine if a molecule is within a hybridization limitation of the invention) are 0.5M Sodium Phosphate, 7% SDS at 65° C., followed by one or more washes at 0.2×SSC, 1% SDS at 65° C. Preferably, an isolated nucleic acid molecule of the invention that hybridizes under stringent conditions to the sequence of SEQ ID NO:21, 23, 24, or 26, corresponds to a naturally-occurring nucleic acid molecule.

As used herein, a “naturally-occurring” nucleic acid molecule refers to an RNA or DNA molecule having a nucleotide sequence that occurs in nature (e.g., encodes a natural protein).

As understood by those of ordinary skill, the exact conditions can be determined empirically and depend on ionic strength, temperature and the concentration of destabilizing agents such as formamide or denaturing agents such as SDS. Other factors considered in determining the desired hybridization conditions include the length of the nucleic acid sequences, base composition, percent mismatch between the hybridizing sequences and the frequency of occurrence of subsets of the sequences within other non-identical sequences. Thus, equivalent conditions can be determined by varying one or more of these parameters while maintaining a similar degree of identity or similarity between the two nucleic acid molecules.

The present invention also provides isolated nucleic acids that contain a single or double stranded fragment or portion that hybridizes under stringent conditions to a nucleotide sequence of SEQ ID NO:21, 23, 24, 26, or the complements thereof. In one embodiment, the nucleic acid consists of a portion of a nucleotide sequence of SEQ ID NO:21, 23, 24, 26 or complements thereof. The nucleic acid fragments of the invention are at least about 10, 15, preferably at least about 20 or 25 nucleotides, and can be 30, 38, 40, 50, 68, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, or 2847 nucleotides for SEQ ID NO:21. Alternatively, a nucleic acid molecule that is a fragment of a 26651-like nucleotide sequence of the present invention comprises a nucleotide sequence consisting of nucleotides 1-100, 100-200, 200-300, 300-400, 400-500, 500-600, 600-700, 700-800, 800-900, 900-1000, 1000-1100, 1100-1200, 1200-1300, 1300-1400, 1400-1500, 1500-1600, 1600-1700, 1700-1800, 1800-1900, 1900-2000, 2000-2100, 2100-2200, 2200-2300, 2300-2400, 2400-2500, 2500-2600, 2600-2700, 2700-2800, 2800-2847 of SEQ ID NO:21. The nucleic acid fragments of the invention are at least about 10, 15, 20, 25, 30, 38, 40, 50, 68, 75, 100, 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, or 3391 nucleotides for SEQ ID NO:24. Alternatively, a nucleic acid molecule that is a fragment of a 26138-like nucleotide sequence of the present invention comprises a nucleotide sequence consisting of nucleotides 1-100, 100-200, 200-300, 300-400, 400-500, 500-600, 600-700, 700-800, 800-900, 900-1000, 1000-1100, 1100-1200, 1200-1300, 1300-1400, 1400-1500, 1500-1600, 1600-1700, 1700-1800, 1800-1900, 1900-2000, 2000-2100, 2100-2200, 2200-2300, 2300-2400, 2400-2500, 2500-2600, 2600-2700, 2700-2800, 2800-2900, 2900-3000, 3000-3100, 3100-3200, 3200-3300, 3300-3391 of SEQ ID NO:24. Longer fragments, for example, 30 or more nucleotides in length, which encode antigenic proteins or polypeptides described herein are useful.
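The nucleotide ranges recited above follow a regular pattern that can be generated programmatically. The Python sketch below is illustrative only; the function name and the `step` parameter are hypothetical, and the sequence lengths of 2847 and 3391 nucleotides are those stated above for SEQ ID NO:21 and SEQ ID NO:24.

```python
def fragment_ranges(length, step=100):
    """Return 1-based (start, end) nucleotide ranges 1-100, 100-200, ...,
    with the final range ending at the last nucleotide of the sequence."""
    if length < 1 or step < 1:
        raise ValueError("length and step must be positive")
    bounds = list(range(step, length, step))   # 100, 200, ... below length
    starts = [1] + bounds
    ends = bounds + [length]
    return list(zip(starts, ends))

def extract_fragment(sequence, start, end):
    """Slice a sequence using 1-based, inclusive coordinates."""
    return sequence[start - 1:end]

ranges_seq21 = fragment_ranges(2847)   # last range is (2800, 2847)
ranges_seq24 = fragment_ranges(3391)   # last range is (3300, 3391)
```

Note that adjacent ranges share their boundary nucleotide (e.g., position 100 appears in both 1-100 and 100-200), as in the ranges recited above.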

Furthermore, the invention provides polynucleotides that comprise a fragment of the full-length GAP polynucleotides. The fragment can be single or double stranded and can comprise DNA or RNA. The fragment can be derived from either the coding or the non-coding sequence.

In another embodiment an isolated nucleic acid encodes the entire coding region. Other fragments include nucleotide sequences encoding the amino acid fragments described herein. Further fragments can include subfragments of the specific domains or sites described herein. Fragments also include nucleic acid sequences corresponding to specific amino acid sequences described above or fragments thereof. Nucleic acid fragments, according to the present invention, are not to be construed as encompassing those fragments that may have been disclosed prior to the invention.

Nucleic acid fragments further include sequences corresponding to the domains described herein, subregions also described, and specific functional sites. Nucleic acid fragments also include combinations of the domains, segments, loops, and other functional sites described above. A person of ordinary skill in the art would be aware of the many permutations that are possible.

Where the location of the domains or sites has been predicted by computer analysis, one of ordinary skill would appreciate that the amino acid residues constituting these domains can vary depending on the criteria used to define the domains.

However, it is understood that a fragment includes any nucleic acid sequence that does not include the entire gene.

The invention also provides nucleic acid fragments that encode epitope bearing regions of the proteins described herein.

The isolated polynucleotide sequences, and especially fragments, are useful as DNA probes and primers.

For example, the coding region of a gene of the invention can be isolated using the known nucleotide sequence to synthesize an oligonucleotide probe. A labeled probe can then be used to screen a cDNA library, genomic DNA library, or mRNA to isolate nucleic acid corresponding to the coding region. Further, primers can be used in PCR reactions to clone specific regions of these genes.

A probe/primer typically comprises substantially purified oligonucleotide. The oligonucleotide typically comprises a region of nucleotide sequence that hybridizes under stringent conditions to at least about 5, 10, 12, typically about 25, more typically about 40, 50 or 75 consecutive nucleotides of the sense or antisense strand of SEQ ID NO:21, 23, 24, 26, or other GAP polynucleotides. A probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

Polynucleotide Uses

The nucleic acid fragments of the invention provide probes or primers in assays such as those described below. “Probes” are oligonucleotides that hybridize in a base-specific manner to a complementary strand of nucleic acid. Such probes include peptide nucleic acids, as described in Nielsen et al. (1991) Science 254:1497-1500. Typically, a probe comprises a region of nucleotide sequence that hybridizes under highly stringent conditions to at least about 15, typically about 20-25, and more typically about 40, 50 or 75 consecutive nucleotides of a nucleic acid of SEQ ID NO:21, 23, 24, 26, or the complements thereof. More typically, the probe further comprises a label, e.g., radioisotope, fluorescent compound, enzyme, or enzyme co-factor.

As used herein, the term “primer” refers to a single-stranded oligonucleotide which acts as a point of initiation of template-directed DNA synthesis using well-known methods (e.g., PCR, LCR) including, but not limited to those described herein. The appropriate length of the primer depends on the particular use, but typically ranges from about 15 to 30 nucleotides. The term “primer site” refers to the area of the target DNA to which a primer hybridizes. The term “primer pair” refers to a set of primers including a 5′ (upstream) primer that hybridizes with the 5′ end of the nucleic acid sequence to be amplified and a 3′ (downstream) primer that hybridizes with the complement of the sequence to be amplified.
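The relationship between a primer pair and its target can be sketched in Python as follows. The sketch assumes a common convention in which both primers are written 5′ to 3′, the upstream primer matches the start of the given strand, and the downstream primer is the reverse complement of its end. The function names, the default primer length of 20 nucleotides, and the target sequence are hypothetical; no melting-temperature or specificity checks are performed.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]

def primer_pair(target, length=20):
    """Return (upstream, downstream) primers flanking the target.

    The upstream (5') primer is identical to the first `length` nucleotides
    of the given strand; the downstream (3') primer is the reverse
    complement of the last `length` nucleotides.
    """
    target = target.upper()
    if len(target) < length:
        raise ValueError("target is shorter than the requested primer length")
    upstream = target[:length]
    downstream = reverse_complement(target[-length:])
    return upstream, downstream

# Hypothetical 50-nucleotide target, not taken from any SEQ ID NO.
example_target = "ATGGCTAAAGGTCCTGAACTGTTCACCGGTGTTGTACCAATCCTGGTTGA"
upstream_primer, downstream_primer = primer_pair(example_target)
```

A primer length within the typical range of about 15 to 30 nucleotides can be chosen through the `length` argument.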

The polynucleotides are useful for probes, primers, and in biological assays, including, but not limited to, methods using the cells and tissues in which the gene is expressed, particularly in which the gene is significantly expressed, and involving disorders including, but not limited to, those also discussed herein above with respect to biological methods and assays involving the GAP polypeptides of the invention.

Where the polynucleotides are used to assess GAP properties or functions, such as in the assays described herein, all or less than all of the entire cDNA can be useful. In this case, even fragments that may have been known prior to the invention are encompassed. Thus, for example, assays specifically directed to GAPs, and especially RasGAP or RhoGAP functions, such as assessing agonist or antagonist activity, encompass the use of known fragments. Further, diagnostic methods for assessing function can also be practiced with any fragment, including those fragments that may have been known prior to the invention. Similarly, in methods involving modulation or treatment of GAP-related dysfunction, all fragments are encompassed including those which may have been known in the art.

The polynucleotides are useful as hybridization probes for cDNA and genomic DNA to isolate full-length cDNA and genomic clones encoding a polypeptide described in SEQ ID NO:22 or SEQ ID NO:25 and to isolate cDNA and genomic clones that correspond to variants producing one of the same polypeptides shown in SEQ ID NO:22, SEQ ID NO:25, or the other variants described herein. Variants can be isolated from the same tissue and organism from which a polypeptide shown in SEQ ID NO:22 or SEQ ID NO:25 was isolated, different tissues from the same organism, or from different organisms. This method is useful for isolating genes and cDNA that are developmentally controlled and therefore may be expressed in the same tissue or different tissues at different points in the development of an organism.

The probe can correspond to any sequence along the entire length of the gene encoding a protein of the invention. Accordingly, it could be derived from 5′ noncoding regions, the coding region, and 3′ noncoding regions. It is understood, however, as discussed herein, that fragments corresponding to the probe do not include those fragments that may have been disclosed prior to the present invention.

The nucleic acid probe can be, for example, a full-length cDNA that encodes a polypeptide set forth in SEQ ID NO:22 or a fragment thereof, such as an oligonucleotide of at least 5, 10, 12, 15, 30, 38, 50, 68, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, or 2847 nucleotides in length for SEQ ID NO:21 and sufficient to specifically hybridize under stringent conditions to mRNA or DNA. The nucleic acid probe can be, for example, a full-length cDNA that encodes a polypeptide set forth in SEQ ID NO:25 or a fragment thereof, such as an oligonucleotide of at least 5, 10, 12, 15, 30, 38, 50, 68, 100, 150, 200, 250, 300, 350, 400, 450, 500, 550, 600, 650, 700, 750, 800, 850, 900, 950, 1000, 1100, 1200, 1300, 1400, 1500, 1600, 1700, 1800, 1900, 2000, 2100, 2200, 2300, 2400, 2500, 2600, 2700, 2800, 2900, 3000, 3100, 3200, 3300, or 3391 nucleotides in length for SEQ ID NO:24 and sufficient to specifically hybridize under stringent conditions to mRNA or DNA.

Fragments of the polynucleotides described herein are also useful to synthesize larger fragments or full-length polynucleotides described herein. For example, a fragment can be hybridized to any portion of an mRNA and a larger or full-length cDNA can be produced.

The fragments are also useful to synthesize antisense molecules of desired length and sequence.

Antisense nucleic acids of the invention can be designed using a nucleotide sequence of SEQ ID NO:21, 23, 24 or 26, and constructed using chemical synthesis and enzymatic ligation reactions using procedures known in the art. For example, an antisense nucleic acid (e.g., an antisense oligonucleotide) can be chemically synthesized using naturally occurring nucleotides or variously modified nucleotides designed to increase the biological stability of the molecules or to increase the physical stability of the duplex formed between the antisense and sense nucleic acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides can be used. Examples of modified nucleotides which can be used to generate the antisense nucleic acid include 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil, hypoxanthine, xanthine, 4-acetylcytosine, 5-(carboxyhydroxylmethyl)uracil, 5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil, dihydrouracil, beta-D-galactosylqueosine, inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-dimethylguanine, 2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-adenine, 7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil, beta-D-mannosylqueosine, 5′-methoxycarboxymethyluracil, 5-methoxyuracil, 2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine, pseudouracil, queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-methyluracil, uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-thiouracil, 3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine. Alternatively, the antisense nucleic acid can be produced biologically using an expression vector into which a nucleic acid has been subcloned in an antisense orientation (i.e., RNA transcribed from the inserted nucleic acid will be of an antisense orientation to a target nucleic acid of interest).

Additionally, the nucleic acid molecules of the invention can be modified at the base moiety, sugar moiety or phosphate backbone to improve, e.g., the stability, hybridization, or solubility of the molecule. For example, the deoxyribose phosphate backbone of the nucleic acids can be modified to generate peptide nucleic acids (see Hyrup et al. (1996) Bioorganic & Medicinal Chemistry 4:5). As used herein, the terms “peptide nucleic acids” or “PNAs” refer to nucleic acid mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is replaced by a pseudopeptide backbone and only the four natural nucleobases are retained. The neutral backbone of PNAs has been shown to allow for specific hybridization to DNA and RNA under conditions of low ionic strength. The synthesis of PNA oligomers can be performed using standard solid phase peptide synthesis protocols as described in Hyrup et al. (1996), supra; Perry-O'Keefe et al. (1996) Proc. Natl. Acad. Sci. USA 93:14670. PNAs can be further modified, e.g., to enhance their stability, specificity or cellular uptake, by attaching lipophilic or other helper groups to PNA, by the formation of PNA-DNA chimeras, or by the use of liposomes or other techniques of drug delivery known in the art. The synthesis of PNA-DNA chimeras can be performed as described in Hyrup (1996), supra; Finn et al. (1996) Nucleic Acids Res. 24(17):3357-63; Mag et al. (1989) Nucleic Acids Res. 17:5973; and Petersen et al. (1995) Bioorganic Med. Chem. Lett. 5:1119.

The nucleic acid molecules and fragments of the invention can also include other appended groups such as peptides (e.g., for targeting host cell 26651 or 26138 proteins in vivo), or agents facilitating transport across the cell membrane (see, e.g., Letsinger et al. (1989) Proc. Natl. Acad. Sci. USA 86:6553-6556; Lemaitre et al. (1987) Proc. Natl. Acad. Sci. USA 84:648-652; PCT Publication No. WO 88/0918) or the blood brain barrier (see, e.g., PCT Publication No. WO 89/10134). In addition, oligonucleotides can be modified with hybridization-triggered cleavage agents (see, e.g., Krol et al. (1988) Bio-Techniques 6:958-976) or intercalating agents (see, e.g., Zon (1988) Pharm Res. 5:539-549).

The polynucleotides are also useful as primers for PCR to amplify any given region of a polynucleotide of the invention.

The polynucleotides are also useful for constructing recombinant vectors. Such vectors include expression vectors that express a portion of, or all of, the GAP polypeptides of the invention. Vectors also include insertion vectors, used to integrate into another polynucleotide sequence, such as into the cellular genome, to alter in situ expression of genes and gene products of the invention. For example, an endogenous coding sequence can be replaced via homologous recombination with all or part of the coding region containing one or more specifically introduced mutations.

The polynucleotides are also useful for expressing antigenic portions of the proteins of the invention.

The polynucleotides are also useful as probes for determining the chromosomal positions of the polynucleotides of the invention by means of in situ hybridization methods, such as FISH (for a review of this technique, see Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques (Pergamon Press, New York)), and PCR mapping of somatic cell hybrids. The mapping of the sequences to chromosomes is an important first step in correlating these sequences with genes associated with disease.

Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on that chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. (Such data are found, for example, in Mendelian Inheritance in Man, V. McKusick, available on-line through Johns Hopkins University Welch Medical Library.) The relationship between a gene and a disease, mapped to the same chromosomal region, can then be identified through linkage analysis (co-inheritance of physically adjacent genes), described in, for example, Egeland et al. (1987) Nature 325:783-787. The chromosomal location of 26138 on human chromosome 19 is indicated in FIG. 63.

Moreover, differences in the DNA sequences between individuals affected and unaffected with a disease associated with a specified gene can be determined. If a mutation is observed in some or all of the affected individuals but not in any unaffected individuals, then the mutation is likely to be the causative agent of the particular disease. Comparison of affected and unaffected individuals generally involves first looking for structural alterations in the chromosomes, such as deletions or translocations that are visible from chromosome spreads or detectable using PCR based on that DNA sequence. Ultimately, complete sequencing of genes from several individuals can be performed to confirm the presence of a mutation and to distinguish mutations from polymorphisms.

The polynucleotide probes are also useful to determine patterns of the presence of the gene encoding the proteins and their variants with respect to tissue distribution, for example, whether gene duplication has occurred and whether the duplication occurs in all or only a subset of tissues. The genes can be naturally occurring or can have been introduced into a cell, tissue, or organism exogenously.

The polynucleotides are also useful for designing ribozymes corresponding to all, or a part, of the mRNA produced from genes encoding the polynucleotides described herein.

The polynucleotides are also useful for constructing host cells expressing a part, or all, of the polynucleotides and polypeptides.

The polynucleotides are also useful for constructing transgenic animals expressing all, or a part, of the polynucleotides and polypeptides.

The polynucleotides are also useful for making vectors that express part, or all, of the polypeptides.

The polynucleotides are also useful as hybridization probes for determining the level of nucleic acid expression of a sequence of the invention. Accordingly, the probes can be used to detect the presence of, or to determine levels of, a nucleic acid molecule of the invention in cells, tissues, and in organisms. The nucleic acid whose level is determined can be DNA or RNA. Accordingly, probes corresponding to the polypeptides described herein can be used to assess gene copy number in a given cell, tissue, or organism. This is particularly relevant in cases in which there has been an amplification of a gene of the invention.

Alternatively, the probe can be used in an in situ hybridization context to assess the position of extra copies of a gene of the invention, as on extrachromosomal elements or as integrated into chromosomes in which the gene is not normally found, for example as a homogeneously staining region.

These uses are relevant for diagnosis of disorders involving an increase or decrease in expression relative to normal, such as a proliferative disorder, a differentiative or developmental disorder, a hematopoietic disorder or a viral disorder, especially as disclosed herein.

Thus, the present invention provides a method for identifying a disease or disorder associated with aberrant expression or activity of a nucleic acid of the invention, in which a test sample is obtained from a subject and nucleic acid (e.g., mRNA, genomic DNA) is detected, wherein the presence of the nucleic acid is diagnostic for a subject having or at risk of developing a disease or disorder associated with aberrant expression or activity of the nucleic acid. “Misexpression or aberrant expression”, as used herein, refers to a non-wild type pattern of gene expression, at the RNA or protein level. It includes: expression at non-wild type levels, i.e., over or under expression; a pattern of expression that differs from wild type in terms of the time or stage at which the gene is expressed, e.g., increased or decreased expression (as compared with wild type) at a predetermined developmental period or stage; a pattern of expression that differs from wild type in terms of decreased expression (as compared with wild type) in a predetermined cell type or tissue type; a pattern of expression that differs from wild type in terms of the splicing size, amino acid sequence, post-translational modification, or biological activity of the expressed polypeptide; a pattern of expression that differs from wild type in terms of the effect of an environmental stimulus or extracellular stimulus on expression of the gene, e.g., a pattern of increased or decreased expression (as compared with wild type) in the presence of an increase or decrease in the strength of the stimulus.

One aspect of the invention relates to diagnostic assays for determining nucleic acid expression as well as activity in the context of a biological sample (e.g., blood, serum, cells, tissue) to determine whether an individual has a disease or disorder, or is at risk of developing a disease or disorder, associated with aberrant nucleic acid expression or activity. Such assays can be used for prognostic or predictive purposes to thereby prophylactically treat an individual prior to the onset of a disorder characterized by or associated with expression or activity of the nucleic acid molecules.

In vitro techniques for detection of mRNA include Northern hybridizations and in situ hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ hybridization.

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that express a protein of the invention, such as by measuring the level of a nucleic acid encoding the protein in a sample of cells from a subject, e.g., mRNA or genomic DNA, or determining if the gene has been mutated.

Nucleic acid expression assays are useful for drug screening to identify compounds that modulate expression of a nucleic acid of the invention (e.g., antisense, polypeptides, peptidomimetics, small molecules or other drugs). A cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of an mRNA of the invention in the presence of the candidate compound is compared to the level of expression of the mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. The modulator can bind to the nucleic acid or indirectly modulate expression, such as by interacting with other cellular components that affect nucleic acid expression.

Modulatory methods can be performed in vitro (e.g., by culturing the cell with the agent) or, alternatively, in vivo (e.g., by administering the agent to a subject) in patients or in transgenic animals.

The invention thus provides a method for identifying a compound that can be used to treat a disorder associated with nucleic acid expression of a gene of the invention. The method typically includes assaying the ability of the compound to modulate the expression of a nucleic acid of the invention and thus identifying a compound that can be used to treat a disorder characterized by undesired expression of a nucleic acid of the invention.

The assays can be performed in cell-based and cell-free systems. Cell-based assays include cells naturally expressing a nucleic acid of the invention, such as discussed herein above, or recombinant cells genetically engineered to express specific nucleic acid sequences.

Alternatively, candidate compounds can be assayed in vivo in patients or in transgenic animals.

The assay for expression of a nucleic acid of the invention can involve direct assay of nucleic acid levels, such as mRNA levels, or assay of collateral compounds involved in GAP function. Further, the expression of genes that are up- or down-regulated in response to GAP activity, as in a signal pathway (such as cyclic AMP or phosphatidylinositol turnover), can also be assayed. In this embodiment the regulatory regions of these genes can be operably linked to a reporter gene such as luciferase.

Thus, modulators of gene expression can be identified in a method wherein a cell is contacted with a candidate compound and the expression of mRNA determined. The level of expression of mRNA in the presence of the candidate compound is compared to the level of expression of mRNA in the absence of the candidate compound. The candidate compound can then be identified as a modulator of nucleic acid expression based on this comparison and be used, for example, to treat a disorder characterized by aberrant nucleic acid expression. When expression of mRNA is statistically significantly greater in the presence of the candidate compound than in its absence, the candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid expression is statistically significantly less in the presence of the candidate compound than in its absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
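
By way of illustration only, the comparison described above can be sketched in code. The text does not specify a statistical test; the sketch below assumes an exact two-sided permutation test on the difference of mean mRNA levels and a significance threshold of 0.05, both of which are assumptions, as are the function names and the example measurements.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(treated, untreated):
    """Exact two-sided permutation test on the difference of group means.

    Every way of relabelling the pooled measurements into groups of the
    original sizes is enumerated; the p-value is the fraction of relabellings
    whose absolute mean difference is at least the observed one.
    """
    pooled = list(treated) + list(untreated)
    n_treated = len(treated)
    n_untreated = len(pooled) - n_treated
    observed = abs(mean(treated) - mean(untreated))
    total = sum(pooled)
    hits = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_untreated)
        count += 1
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    return hits / count


def classify_compound(treated, untreated, alpha=0.05):
    """Label a candidate compound from mRNA levels with and without it."""
    if permutation_p_value(treated, untreated) >= alpha:
        return "no significant effect"
    return "stimulator" if mean(treated) > mean(untreated) else "inhibitor"


# Hypothetical relative mRNA levels (arbitrary units), four replicates each.
with_compound = [9.8, 10.4, 10.1, 9.9]
without_compound = [5.1, 4.8, 5.3, 5.0]
print(classify_compound(with_compound, without_compound))
```

With four replicates per group the smallest attainable p-value in this test is 2/70; with only three replicates per group it is 2/20, so no result could reach the 0.05 threshold, which is one reason a different test or more replicates might be preferred in practice.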

Accordingly, the invention provides methods of treatment, with the nucleic acid as a target, using a compound identified through drug screening as a gene modulator to modulate nucleic acid expression. Modulation includes both up-regulation (i.e. activation or agonization) and down-regulation (suppression or antagonization), as well as effects on nucleic acid activity (e.g. when nucleic acid is mutated or improperly modified). Treatment is of disorders characterized by aberrant expression or activity of the nucleic acid.

Alternatively, a modulator for nucleic acid expression can be a small molecule or drug identified using the screening assays described herein as long as the drug or small molecule inhibits the nucleic acid expression.

The polynucleotides are also useful for monitoring the effectiveness of modulating compounds on the expression or activity of the gene in clinical trials or in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the continuing effectiveness of treatment with the compound, particularly with compounds to which a patient can develop resistance. The gene expression pattern can also serve as a marker indicative of a physiological response of the affected cells to the compound. Accordingly, such monitoring would allow either increased administration of the compound or the administration of alternative compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid expression falls below a desirable level, administration of the compound could be commensurately decreased.

Monitoring can be, for example, as follows: (i) obtaining a pre-administration sample from a subject prior to administration of the agent; (ii) detecting the level of expression of a specified mRNA or genomic DNA of the invention in the pre-administration sample; (iii) obtaining one or more post-administration samples from the subject; (iv) detecting the level of expression or activity of the mRNA or genomic DNA in the post-administration samples; (v) comparing the level of expression or activity of the mRNA or genomic DNA in the pre-administration sample with the mRNA or genomic DNA in the post-administration sample or samples; and (vi) increasing or decreasing the administration of the agent to the subject accordingly.
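
By way of illustration only, steps (v) and (vi) above can be sketched as a simple decision rule. The sketch assumes that the agent is intended to lower expression and that a desired expression window has been defined beforehand; the function name, the window and the example values are hypothetical.

```python
from statistics import mean


def monitor(pre_level, post_levels, desired_low, desired_high):
    """Compare pre- and post-administration expression and suggest step (vi).

    Assumption: the agent is meant to LOWER expression, so expression above
    the desired window suggests increasing administration and expression
    below the window suggests decreasing it.
    """
    if not post_levels:
        raise ValueError("at least one post-administration sample is required")
    post_level = mean(post_levels)
    if post_level > desired_high:
        action = "increase"
    elif post_level < desired_low:
        action = "decrease"
    else:
        action = "maintain"
    return {
        "pre": pre_level,
        "post": post_level,
        "change": post_level - pre_level,
        "action": action,
    }


# Hypothetical expression levels in arbitrary units.
result = monitor(pre_level=10.0, post_levels=[9.0, 9.5],
                 desired_low=2.0, desired_high=6.0)
print(result["action"], result["change"])
```

For an agent intended to raise expression the two branches would be swapped; the rule is deliberately coarse and does not address sampling variability.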

The polynucleotides are also useful in diagnostic assays for qualitative changes in a nucleic acid of the invention, and particularly in qualitative changes that lead to pathology. The polynucleotides can be used to detect mutations in genes of the invention and gene expression products such as mRNA. The polynucleotides can be used as hybridization probes to detect naturally-occurring genetic mutations in a gene of the invention and thereby to determine whether a subject with the mutation is at risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of one or more nucleotides in the gene, chromosomal rearrangement, such as inversion or transposition, modification of genomic DNA, such as aberrant methylation patterns or changes in gene copy number, such as amplification. Detection of a mutated form of the gene associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to disease when the disease results from overexpression, underexpression, or altered expression of a protein of the invention.

Mutations in the gene can be detected at the nucleic acid level by a variety of techniques. Genomic DNA can be analyzed directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same way.

In certain embodiments, detection of the mutation involves the use of a probe/primer in a polymerase chain reaction (PCR) (see, e.g., U.S. Pat. Nos. 4,683,195 and 4,683,202), such as anchor PCR or RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegren et al., Science 241:1077-1080 (1988); and Nakazawa et al., PNAS 91:360-364 (1994)), the latter of which can be particularly useful for detecting point mutations in the gene (see Abravaya et al., Nucleic Acids Res. 23:675-682 (1995)). This method can include the steps of collecting a sample of cells from a patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample, contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene under conditions such that hybridization and amplification of the gene (if present) occurs, and detecting the presence or absence of an amplification product, or detecting the size of the amplification product and comparing the length to a control sample. Deletions and insertions can be detected by a change in size of the amplified product compared to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA sequences.
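
By way of illustration only, the size comparison described above can be sketched as follows. The function name, the optional sizing tolerance and the example product lengths are hypothetical; the sketch only interprets measured lengths and does not model the amplification itself.

```python
def classify_amplicon(sample_bp, control_bp, tolerance_bp=0):
    """Interpret an amplification product by its size relative to a control.

    sample_bp is None when no product was detected. tolerance_bp is an
    assumed allowance for sizing error of the detection method.
    Returns a (call, size_difference) pair.
    """
    if sample_bp is None:
        return ("no amplification product", None)
    delta = sample_bp - control_bp
    if abs(delta) <= tolerance_bp:
        return ("same size as control", delta)
    if delta > 0:
        return ("insertion", delta)
    return ("deletion", delta)


# Hypothetical product sizes in base pairs.
print(classify_amplicon(412, 400))   # larger product than the control
print(classify_amplicon(385, 400))   # smaller product than the control
print(classify_amplicon(None, 400))  # no product detected
```

A product of the same size as the control does not exclude a point mutation, which is why the text pairs size comparison with hybridization-based detection.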

It is anticipated that PCR and/or LCR may be desirable to use as a preliminary amplification step in conjunction with any of the techniques used for detecting mutations described herein.

Alternative amplification methods include: self sustained sequence replication (Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87:1874-1878), transcriptional amplification system (Kwoh et al. (1989) Proc. Natl. Acad. Sci. USA 86:1173-1177), Q-Beta Replicase (Lizardi et al. (1988) Bio/Technology 6:1197), or any other nucleic acid amplification method, followed by the detection of the amplified molecules using techniques well-known to those of skill in the art. These detection schemes are especially useful for the detection of nucleic acid molecules if such molecules are present in very low numbers.

Alternatively, mutations in a gene of the invention can be directly identified, for example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
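
By way of illustration only, the effect of a mutation on a restriction digestion pattern can be sketched as follows. The sketch uses the EcoRI recognition site GAATTC, cut after the first G, on invented toy sequences; it models a linear molecule, searches one strand only (sufficient for a palindromic site such as this one), and the function name is hypothetical.

```python
def digest(seq, site="GAATTC", cut_offset=1):
    """Return fragment lengths (largest first) for a linear DNA molecule.

    The sequence is cut at every occurrence of `site`; `cut_offset` is the
    position of the cut within the site (1 for EcoRI, G^AATTC).
    """
    seq = seq.upper()
    site = site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return sorted((b - a for a, b in zip(bounds, bounds[1:])), reverse=True)


# Invented toy sequences: the mutant differs by one base that destroys
# the second EcoRI site (GAATTC -> GAGTTC).
wild_type = "ATCG" + "GAATTC" + "TTAGGCC" + "GAATTC" + "AAGT"
mutant = "ATCG" + "GAATTC" + "TTAGGCC" + "GAGTTC" + "AAGT"

print(digest(wild_type))  # three fragments
print(digest(mutant))     # two fragments: one site has been lost
```

The loss of a site merges two fragments into one larger band, which is the kind of change that would be read from a gel.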

Further, sequence-specific ribozymes (U.S. Pat. No. 5,498,531) can be used to score for the presence of specific mutations by development or loss of a ribozyme cleavage site.

Perfectly matched sequences can be distinguished from mismatched sequences by nuclease cleavage digestion assays or by differences in melting temperature.

Sequence changes at specific locations can also be assessed by nuclease protection assays such as RNase and S1 protection or the chemical cleavage method.

Furthermore, sequence differences between a mutant gene of the invention and the wild-type gene can be determined by direct DNA sequencing. A variety of automated sequencing procedures can be utilized when performing the diagnostic assays ((1995) Biotechniques 19:448), including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO 94/16101; Cohen et al., Adv. Chromatogr. 36:127-162 (1996); and Griffin et al., Appl. Biochem. Biotechnol. 38:147-159 (1993)).

Other methods for detecting mutations in the gene include methods in which protection from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes (Myers et al., Science 230:1242 (1985); Cotton et al., PNAS 85:4397 (1988); Saleeba et al., Meth. Enzymol. 217:286-295 (1992)), methods in which electrophoretic mobility of mutant and wild type nucleic acid is compared (Orita et al., PNAS 86:2766 (1989); Cotton et al., Mutat. Res. 285:125-144 (1993); and Hayashi et al., Genet. Anal. Tech. Appl. 9:73-79 (1992)), and methods in which movement of mutant or wild-type fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing gradient gel electrophoresis (Myers et al., Nature 313:495 (1985)). The sensitivity of the assay may be enhanced by using RNA (rather than DNA), in which the secondary structure is more sensitive to a change in sequence. In one embodiment, the subject method utilizes heteroduplex analysis to separate double stranded heteroduplex molecules on the basis of changes in electrophoretic mobility (Keen et al. (1991) Trends Genet. 7:5). Examples of other techniques for detecting point mutations include selective oligonucleotide hybridization, selective amplification, and selective primer extension.

In other embodiments, genetic mutations can be identified by hybridizing a sample and control nucleic acids, e.g., DNA or RNA, to high density arrays containing hundreds or thousands of oligonucleotide probes (Cronin et al. (1996) Human Mutation 7:244-255; Kozal et al. (1996) Nature Medicine 2:753-759). For example, genetic mutations can be identified in two dimensional arrays containing light-generated DNA probes as described in Cronin et al. supra. Briefly, a first hybridization array of probes can be used to scan through long stretches of DNA in a sample and control to identify base changes between the sequences by making linear arrays of sequential overlapping probes. This step allows the identification of point mutations. This step is followed by a second hybridization array that allows the characterization of specific mutations by using smaller, specialized probe arrays complementary to all variants or mutations detected. Each mutation array is composed of parallel probe sets, one complementary to the wild-type gene and the other complementary to the mutant gene.
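
By way of illustration only, the first scanning step with sequential overlapping probes can be sketched as follows. The sketch is a deliberate simplification: hybridization is modelled as an exact string match between a probe derived from the control sequence and the aligned window of the sample, only substitutions are considered, and the function names and sequences are hypothetical rather than taken from Cronin et al.

```python
def tiling_probes(reference, probe_len, step=1):
    """Return (start, probe) pairs of sequential overlapping probes."""
    last_start = len(reference) - probe_len
    return [(i, reference[i:i + probe_len])
            for i in range(0, last_start + 1, step)]


def failed_probes(reference, sample, probe_len, step=1):
    """Start positions of reference probes that do not match the sample.

    Assumes reference and sample have equal length (substitutions only) and
    treats a perfect string match as a stand-in for hybridization.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch handles substitutions only")
    return [start for start, probe in tiling_probes(reference, probe_len, step)
            if sample[start:start + probe_len] != probe]


def locate_change(failed, probe_len):
    """Interval of positions covered by every failed probe (single change)."""
    if not failed:
        return None
    return (max(failed), min(failed) + probe_len - 1)


# Invented sequences differing by one substitution at position 6 (C -> A).
control = "ACGTTGCAAGCT"
sample = "ACGTTGAAAGCT"
failed = failed_probes(control, sample, probe_len=4)
print(failed, locate_change(failed, probe_len=4))
```

With a step of one base, a single substitution makes a run of adjacent probes fail, and the overlap of that run pins down the changed position.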

The polynucleotides are also useful for testing an individual for a genotype that, while not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the polynucleotides can be used to study the relationship between an individual's genotype and the individual's response to a compound used for treatment (pharmacogenomic relationship). In the present case, for example, a mutation in the gene that results in altered affinity for a GTPase or an effector molecule (or analog) could result in an excessive or decreased drug effect with standard concentrations of GTPase, or effector (or analog). Accordingly, the polynucleotides described herein can be used to assess the mutation content of the gene in an individual in order to select an appropriate compound or dosage regimen for treatment.

Thus, polynucleotides displaying genetic variations that affect treatment provide a diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production of recombinant cells and animals containing these polymorphisms allows effective clinical design of treatment compounds and dosage regimens.

The methods can involve obtaining a control biological sample from a control subject, contacting the control sample with a compound or agent capable of detecting mRNA, or genomic DNA, such that the presence of mRNA or genomic DNA is detected in the biological sample, and comparing the presence of mRNA or genomic DNA in the control sample with the presence of mRNA or genomic DNA in the test sample.

The polynucleotides are also useful for chromosome identification when the sequence is identified with an individual chromosome and to a particular location on the chromosome. First, the DNA sequence is matched to the chromosome by in situ or other chromosome-specific hybridization. Sequences can also be correlated to specific chromosomes by preparing PCR primers that can be used for PCR screening of somatic cell hybrids containing individual chromosomes from the desired species. Only hybrids containing the chromosome containing the gene homologous to the primer will yield an amplified fragment. Sublocalization can be achieved using chromosomal fragments. Other strategies include prescreening with labeled flow-sorted chromosomes and preselection by hybridization to chromosome-specific libraries. Further mapping strategies include fluorescence in situ hybridization which allows hybridization with probes shorter than those traditionally used. Reagents for chromosome mapping can be used individually to mark a single chromosome or a single site on the chromosome, or panels of reagents can be used for marking multiple sites and/or multiple chromosomes. Reagents corresponding to noncoding regions of the genes actually are preferred for mapping purposes. Coding sequences are more likely to be conserved within gene families, thus increasing the chance of cross hybridizations during chromosomal mapping.

The polynucleotides can also be used to identify individuals from small biological samples. This can be done, for example, using restriction fragment-length polymorphism (RFLP) to identify an individual. Thus, the polynucleotides described herein are useful as DNA markers for RFLP (see U.S. Pat. No. 5,272,057).

Furthermore, the sequence can be used to provide an alternative technique which determines the actual DNA sequence of selected fragments in the genome of an individual. Thus, the sequences described herein can be used to prepare two PCR primers from the 5′ and 3′ ends of the sequences. These primers can then be used to amplify DNA from an individual for subsequent sequencing.

Panels of corresponding DNA sequences from individuals prepared in this manner can provide unique individual identifications, as each individual will have a unique set of such DNA sequences. It is estimated that allelic variation in humans occurs with a frequency of about once per 500 bases. Allelic variation occurs to some degree in the coding regions of these sequences, and to a greater degree in the noncoding regions. The sequences can be used to obtain such identification sequences from individuals and from tissue. The sequences represent unique fragments of the human genome. Each of the sequences described herein can, to some degree, be used as a standard against which DNA from an individual can be compared for identification purposes.
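
By way of illustration only, the stated estimate of about one allelic variation per 500 bases can be turned into a rough expectation for a sequence of a given length. The lengths used below are the 2847 and 3391 nucleotide lengths recited herein for SEQ ID NO:21 and SEQ ID NO:24. Treating variant sites as independent and Poisson-distributed is an added assumption, not a statement of the source, and the function names are hypothetical.

```python
import math


def expected_variant_sites(length_nt, bases_per_variant=500):
    """Expected number of variant sites at one variation per 500 bases."""
    return length_nt / bases_per_variant


def probability_of_any_variant(length_nt, bases_per_variant=500):
    """Chance of at least one variant site, ASSUMING a Poisson model."""
    return 1.0 - math.exp(-expected_variant_sites(length_nt, bases_per_variant))


for name, length in (("SEQ ID NO:21", 2847), ("SEQ ID NO:24", 3391)):
    print(name, length,
          round(expected_variant_sites(length), 2),
          round(probability_of_any_variant(length), 4))
```

Because the text notes that variation is greater in noncoding than in coding regions, a single uniform rate is only a coarse guide.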

If a panel of reagents from the sequences is used to generate a unique identification database for an individual, those same reagents can later be used to identify tissue from that individual. Using the unique identification database, positive identification of the individual, living or dead, can be made from extremely small tissue samples.

The polynucleotides can also be used in forensic identification procedures. PCR technology can be used to amplify DNA sequences taken from very small biological samples, such as a single hair follicle or body fluids (e.g. blood, saliva, or semen). The amplified sequence can then be compared to a standard allowing identification of the origin of the sample.

The polynucleotides can thus be used to provide polynucleotide reagents, e.g., PCR primers, targeted to specific loci in the human genome, which can enhance the reliability of DNA-based forensic identifications by, for example, providing another “identification marker” (i.e., another DNA sequence that is unique to a particular individual). As described above, actual base sequence information can be used for identification as an accurate alternative to patterns formed by restriction enzyme generated fragments. Sequences targeted to the noncoding region are particularly useful since greater polymorphism occurs in the noncoding regions, making it easier to differentiate individuals using this technique.

The polynucleotides can further be used to provide polynucleotide reagents, e.g., labeled or labelable probes which can be used in, for example, an in situ hybridization technique, to identify a specific tissue. This is useful in cases in which a forensic pathologist is presented with a tissue of unknown origin. Panels of probes can be used to identify tissue by species and/or by organ type.

In a similar fashion, these primers and probes can be used to screen tissue culture for contamination (i.e., screen for the presence of a mixture of different types of cells in a culture).

Alternatively, the polynucleotides can be used directly to block transcription or translation of nucleic acid sequences of the invention by means of antisense or ribozyme constructs. Thus, in a disorder characterized by abnormally high or undesirable expression of a gene of the invention, nucleic acids can be directly used for treatment.

The polynucleotides are thus useful as antisense constructs to control expression of a gene of the invention in cells, tissues, and organisms. A DNA antisense polynucleotide is designed to be complementary to a region of the gene involved in transcription, preventing transcription and hence production of protein. An antisense RNA or DNA polynucleotide would hybridize to the mRNA and thus block translation of mRNA into protein.

Examples of antisense molecules useful to inhibit nucleic acid expression include antisense molecules complementary to a fragment of the 5′ untranslated region of a sequence of SEQ ID NO:21 or SEQ ID NO:24 which also includes the start codon, and antisense molecules which are complementary to a fragment of the 3′ untranslated region of the sequence.
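
As a minimal sketch of the first example, an antisense DNA oligonucleotide complementary to a window spanning the start codon can be derived from a cDNA sequence as shown below. The function name, the window sizes, and the simplifying assumption that the first ATG in the input is the start codon are all hypothetical and made for illustration; the sequences of SEQ ID NO:21 and SEQ ID NO:24 are not reproduced here.

```python
# Illustrative sketch only. Assumes the first ATG in the cDNA is the start
# codon, which is a simplification; names and window sizes are hypothetical.
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_over_start(cdna: str, upstream: int = 10,
                         downstream: int = 10) -> str:
    """Return an antisense DNA oligo (5'->3') covering the start codon.

    The target window runs from `upstream` bases before the ATG (5' UTR)
    to `downstream` bases after it, clipped to the ends of the sequence.
    """
    cdna = cdna.upper()
    start = cdna.find("ATG")
    if start < 0:
        raise ValueError("no ATG found in sequence")
    left = max(0, start - upstream)
    right = min(len(cdna), start + 3 + downstream)
    target = cdna[left:right]
    # The antisense strand is the reverse complement of the target window.
    return target.translate(_DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(antisense_over_start("GGCCATGAAA", upstream=2, downstream=2))
```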

Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to decrease expression of a nucleic acid of the invention. Accordingly, these molecules can treat a disorder characterized by abnormal or undesired expression of a nucleic acid of the invention. This technique involves cleavage by means of ribozymes containing nucleotide sequences complementary to one or more regions in the mRNA that attenuate the ability of the mRNA to be translated. Possible regions include coding regions and particularly coding regions corresponding to the catalytic and other functional activities of the protein, such as GTPase binding. It is understood that these regions include any of those specific domains, sites, segments, motifs, and the like that are disclosed as specific regions or sites herein.

The polynucleotides also provide vectors for gene therapy in patients containing cells that are aberrant in expression of a gene of the invention. Thus, recombinant cells, which include the patient's cells that have been engineered ex vivo and returned to the patient, are introduced into an individual where the cells produce the desired protein to treat the individual.

The invention also encompasses kits for detecting the presence of a nucleic acid of the invention in a biological sample. For example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable of detecting the nucleic acid in a biological sample; means for determining the amount of the nucleic acid in the sample; and means for comparing the amount of the nucleic acid in the sample with a standard. The compound or agent can be packaged in a suitable container. The kit can further comprise instructions for using the kit to detect an mRNA or DNA of the invention.

Computer Readable Means

The nucleotide or amino acid sequences of the invention are also provided in a variety of mediums to facilitate use thereof. As used herein, “provided” refers to a manufacture, other than an isolated nucleic acid or amino acid molecule, which contains a nucleotide or amino acid sequence of the present invention. Such a manufacture provides the nucleotide or amino acid sequences, or a subset thereof (e.g., a subset of open reading frames (ORFs)), in a form which allows a skilled artisan to examine the manufacture using means not directly applicable to examining the nucleotide or amino acid sequences, or a subset thereof, as they exist in nature or in purified form.

In one application of this embodiment, a nucleotide or amino acid sequence of the present invention can be recorded on computer readable media. As used herein, “computer readable media” refers to any medium that can be read and accessed directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/optical storage media. The skilled artisan will readily appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention.

As used herein, “recorded” refers to a process for storing information on computer readable medium. The skilled artisan can readily adopt any of the presently known methods for recording information on computer readable medium to generate manufactures comprising the nucleotide or amino acid sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium having recorded thereon a nucleotide or amino acid sequence of the present invention. The choice of the data storage structure will generally be based on the means chosen to access the stored information. In addition, a variety of data processor programs and formats can be used to store the nucleotide sequence information of the present invention on computer readable medium. The sequence information can be represented in a word processing text file, formatted in commercially available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored in a database application, such as DB2, Sybase, Oracle, or the like. The skilled artisan can readily adapt any number of data processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having recorded thereon the nucleotide sequence information of the present invention.

By providing the nucleotide or amino acid sequences of the invention in computer readable form, the skilled artisan can routinely access the sequence information for a variety of purposes. For example, one skilled in the art can use the nucleotide or amino acid sequences of the invention in computer readable form to compare a target sequence or target structural motif with the sequence information stored within the data storage means. Search means are used to identify fragments or regions of the sequences of the invention which match a particular target sequence or target motif.
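
A very simple search means of this kind is an exact-match scan of each stored sequence for the target sequence. The sketch below is illustrative only; the function name and the in-memory dictionary of records are assumptions, and a practical system would use an alignment tool rather than exact matching.

```python
# Illustrative sketch only: exact-match search over sequences held in memory.
def find_target(records: dict[str, str], target: str) -> dict[str, list[int]]:
    """Return, per record name, the 0-based start positions of every
    (possibly overlapping) exact occurrence of target."""
    if not target:
        raise ValueError("target sequence must not be empty")
    target = target.upper()
    hits: dict[str, list[int]] = {}
    for name, seq in records.items():
        seq = seq.upper()
        positions = []
        i = seq.find(target)
        while i != -1:
            positions.append(i)
            i = seq.find(target, i + 1)  # step by one to allow overlaps
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    stored = {"rec1": "ACGTACGT", "rec2": "TTTT"}
    print(find_target(stored, "ACG"))
```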

As used herein, a “target sequence” can be any DNA or amino acid sequence of six or more nucleotides or two or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target sequence will be present as a random occurrence in the database. The most preferred sequence length of a target sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized that commercially important fragments, such as sequence fragments involved in gene expression and protein processing, may be of shorter length.
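
The relationship between target length and chance occurrence can be made concrete with a short calculation. The sketch below assumes, for illustration only, that residues in the database are independent and equally frequent, which real sequences are not; under that assumption a given target of length n matches at any one position with probability 4^(-n) for nucleotides or 20^(-n) for amino acids. The function name is hypothetical.

```python
# Illustrative arithmetic only: assumes independent, equally frequent residues.
def expected_chance_matches(target_length: int, database_length: int,
                            alphabet_size: int = 4) -> float:
    """Expected number of purely random exact matches of one target
    in a database of the given total length."""
    if target_length < 1 or alphabet_size < 2:
        raise ValueError("target_length must be >= 1 and alphabet_size >= 2")
    # Number of start positions at which the target could fit.
    positions = max(0, database_length - target_length + 1)
    return positions * alphabet_size ** (-target_length)


if __name__ == "__main__":
    db_len = 1_000_000
    for n in (6, 12, 30):
        print(n, expected_chance_matches(n, db_len))
```

Under these assumptions, each added nucleotide reduces the expected number of chance matches roughly fourfold, which is why longer target sequences are less likely to occur at random.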

As used herein, “a target structural motif,” or “target motif,” refers to any rationally selected sequence or combination of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, but are not limited to, enzyme active sites and signal sequences. Nucleic acid target motifs include, but are not limited to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

Computer software is publicly available which allows a skilled artisan to access sequence information provided in a computer readable medium for analysis and comparison to other sequences. A variety of known algorithms are disclosed publicly, and a variety of commercially available software for conducting search means is available and can be used in the computer-based systems of the present invention. Examples of such software include, but are not limited to, MacPattern (EMBL), BLASTN and BLASTX (NCBI).

For example, software which implements the BLAST (Altschul et al. (1990) J. Mol. Biol. 215:403-410) and BLAZE (Brutlag et al. (1993) Comp. Chem. 17:203-207) search algorithms on a Sybase system can be used to identify open reading frames (ORFs) of the sequences of the invention which contain homology to ORFs or proteins from other libraries. Such ORFs are protein encoding fragments and are useful in producing commercially important proteins such as enzymes used in various reactions and in the production of commercially useful metabolites.
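
Before ORFs can be compared against other libraries, candidate ORFs must first be located in the stored sequence. The sketch below is not BLAST or BLAZE; it is a deliberately simple, assumed stand-in that scans the three forward reading frames for an ATG followed in frame by a stop codon. The function name and the `min_codons` threshold are hypothetical, and the reverse strand is not examined.

```python
# Illustrative sketch only: a naive forward-strand ORF scan, not BLAST/BLAZE.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, min_codons: int = 2) -> list[tuple[int, int]]:
    """Return (start, end) pairs, 0-based and end-exclusive, for each ORF
    that begins at ATG and ends at the first in-frame stop codon.

    min_codons counts coding codons including ATG, excluding the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))  # include the stop codon
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    example = "CCATGAAATTTTAGGG"
    for begin, end in find_orfs(example):
        print(begin, end, example[begin:end])
```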

Vectors/Host Cells

The invention also provides vectors containing the polynucleotides of the invention. The term “vector” refers to a vehicle, preferably a nucleic acid molecule, that can transport the polynucleotides. When the vector is a nucleic acid molecule, the polynucleotides are covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or artificial chromosome, such as a BAC, PAC, YAC, or MAC.

A vector can be maintained in the host cell as an extrachromosomal element where it replicates and produces additional copies of the polynucleotides. Alternatively, the vector may integrate into the host cell genome and produce additional copies of the polynucleotides when the host cell replicates.

The invention provides vectors for the maintenance (cloning vectors) or vectors for expression (expression vectors) of the polynucleotides. The vectors can function in prokaryotic or eukaryotic cells or in both (shuttle vectors).

Expression vectors contain cis-acting regulatory regions that are operably linked in the vector to the polynucleotides such that transcription of the polynucleotides is allowed in a host cell. The polynucleotides can be introduced into the host cell with a separate polynucleotide capable of affecting transcription. Thus, the second polynucleotide may provide a trans-acting factor interacting with the cis-regulatory control region to allow transcription of the polynucleotides from the vector. Alternatively, a trans-acting factor may be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself.

It is understood, however, that in some embodiments, transcription and/or translation of the polynucleotides can occur in a cell-free system.

The regulatory sequences to which the polynucleotides described herein can be operably linked include promoters for directing mRNA transcription. These include, but are not limited to, the left promoter from bacteriophage λ, the lac, trp, and tac promoters from E. coli, the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early and late promoters, and retrovirus long-terminal repeats.

In addition to control regions that promote transcription, expression vectors may also include regions that modulate transcription, such as repressor binding sites and enhancers. Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

In addition to containing sites for transcription initiation and control, expression vectors can also contain sequences necessary for transcription termination and, in the transcribed region, a ribosome binding site for translation. Other regulatory control elements for expression include initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in the art would be aware of the numerous regulatory sequences that are useful in expression vectors. Such regulatory sequences are described, for example, in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

A variety of expression vectors can be used to express a polynucleotide of the invention. Such vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal elements, including yeast artificial chromosomes, from viruses such as baculoviruses, papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and retroviruses. Vectors may also be derived from combinations of these sources such as those derived from plasmid and bacteriophage genetic elements, e.g., cosmids and phagemids. Appropriate cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989).

The regulatory sequence may provide constitutive expression in one or more host cells (i.e., tissue specific) or may provide for inducible expression in one or more cell types such as by temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are well known to those of ordinary skill in the art.

The polynucleotides can be inserted into the vector nucleic acid by well-known methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an expression vector by cleaving the DNA sequence and the expression vector with one or more restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme digestion and ligation are well known to those of ordinary skill in the art.

The vector containing the appropriate polynucleotide can be introduced into an appropriate host cell for propagation or expression using well-known techniques. Bacterial cells include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and CHO cells, and plant cells.

As described herein, it may be desirable to express the polypeptide as a fusion protein. Accordingly, the invention provides fusion vectors that allow for the production of the polypeptides. Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the recombinant protein, and aid in the purification of the protein by acting, for example, as a ligand for affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion moiety so that the desired polypeptide can ultimately be separated from the fusion moiety. Proteolytic enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion expression vectors include pGEX (Smith et al., Gene 67:31-40 (1988)), pMAL (New England Biolabs, Beverly, Mass.) and pRIT5 (Pharmacia, Piscataway, N.J.) which fuse glutathione S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann et al., Gene 69:301-315 (1988)) and pET 11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185:60-89 (1990)).

Recombinant protein expression can be maximized in a host bacterium by providing a genetic background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant protein (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. (1990) 119-128). Alternatively, the sequence of the polynucleotide of interest can be altered to provide preferential codon usage for a specific host cell, for example E. coli (Wada et al., Nucleic Acids Res. 20:2111-2118 (1992)).

The polynucleotides can also be expressed by expression vectors that are operative in yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include pYepSec1 (Baldari et al., EMBO J. 6:229-234 (1987)), pMFa (Kurjan et al., Cell 30:933-943 (1982)), pJRY88 (Schultz et al., Gene 54:113-123 (1987)), and pYES2 (Invitrogen Corporation, San Diego, Calif.).

The polynucleotides can also be expressed in insect cells using, for example, baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured insect cells (e.g., Sf9 cells) include the pAc series (Smith et al., Mol. Cell. Biol. 3:2156-2165 (1983)) and the pVL series (Luckow et al., Virology 170:31-39 (1989)).

In certain embodiments of the invention, the polynucleotides described herein are expressed in mammalian cells using mammalian expression vectors. Examples of mammalian expression vectors include pCDM8 (Seed, B. Nature 329:840 (1987)) and pMT2PC (Kaufman et al., EMBO J. 6:187-195 (1987)).

The expression vectors listed herein are provided by way of example only of the well-known vectors available to those of ordinary skill in the art that would be useful to express the 26651 or 26138 polynucleotides. The person of ordinary skill in the art would be aware of other vectors suitable for maintenance, propagation, or expression of the polynucleotides described herein. These are found, for example, in Sambrook, J., Fritsch, E. F., and Maniatis, T., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989.

The invention also encompasses vectors in which the nucleic acid sequences described herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or to a portion, of the polynucleotide sequences described herein, including both coding and non-coding regions. Expression of this antisense RNA is subject to each of the parameters described above in relation to expression of the sense RNA (regulatory sequences, constitutive or inducible expression, tissue-specific expression).

The invention also relates to recombinant host cells containing the vectors described herein. Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

The recombinant host cells are prepared by introducing the vector constructs described herein into the cells by techniques readily available to the person of ordinary skill in the art. These include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated transfection, cationic lipid-mediated transfection, electroporation, transduction, infection, lipofection, and other techniques such as those found in Sambrook et al. (Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989).

Host cells can contain more than one vector. Thus, different nucleotide sequences can be introduced on different vectors of the same cell. Similarly, the polynucleotides can be introduced either alone or with other polynucleotides that are not related to the polynucleotides such as those providing trans-acting factors for expression vectors. When more than one vector is introduced into a cell, the vectors can be introduced independently, co-introduced or joined to the polynucleotide vector.

In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be replication-competent or replication-defective. In the case in which viral replication is defective, replication will occur in host cells providing functions that complement the defects.

Vectors generally include selectable markers that enable the selection of the subpopulation of cells that contain the recombinant vector constructs. The marker can be contained in the same vector that contains the polynucleotides described herein or may be on a separate vector. Markers include tetracycline or ampicillin-resistance genes for prokaryotic host cells and dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that provides selection for a phenotypic trait will be effective.

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other cells under the control of the appropriate regulatory sequences, cell-free transcription and translation systems can also be used to produce these proteins using RNA derived from the DNA constructs described herein.

Where secretion of the polypeptide is desired, appropriate secretion signals are incorporated into the vector. The signal sequence can be endogenous to the polypeptides or heterologous to these polypeptides.

Where the polypeptide is not secreted into the medium, the protein can be isolated from the host cell by standard disruption procedures, including freeze thaw, sonication, mechanical disruption, use of lysing agents and the like. The polypeptide can then be recovered and purified by well-known purification methods including ammonium sulfate precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography, or high performance liquid chromatography.

It is also understood that, depending upon the host cell used in recombinant production of the polypeptides described herein, the polypeptides can have various glycosylation patterns, or may be non-glycosylated, as when produced in bacteria. In addition, the polypeptides may include an initial modified methionine in some cases as a result of a host-mediated process.

Host cells of particular interest include those derived from the tissues in which the 26651 polypeptides of the invention are expressed, including tonsil, spleen, fetal liver, adult liver, fibrotic liver, granulocytes, neutrophils, erythroid cells, adipose tissue, bone marrow, colon, lung, kidney, heart, lymphocyte, megakaryocytes, T-cells, and the tissues and cell lines shown in FIGS. 64A-64B.

Host cells of particular interest include those derived from the tissues in which the 26138 polypeptides of the invention are expressed, including tonsil, spleen, fetal liver, adult liver, fibrotic liver, granulocytes, neutrophils, erythroid cells, adipose tissue, bone marrow, colon, lung, kidney, heart, lymphocyte, megakaryocytes, T-cells, and the tissues and cell lines shown in FIGS. 64A-64B.

Uses of Vectors and Host Cells

It is understood that “host cells” and “recombinant host cells” refer not only to the particular subject cell but also to the progeny or potential progeny of such a cell. Because certain modifications may occur in succeeding generations due to either mutation or environmental influences, such progeny may not, in fact, be identical to the parent cell, but are still included within the scope of the term as used herein. A “purified preparation of cells”, as used herein, refers to, in the case of plant or animal cells, an in vitro preparation of cells and not an entire intact plant or animal. In the case of cultured cells or microbial cells, it consists of a preparation of at least 10% and more preferably 50% of the subject cells.

The host cells expressing the polypeptides described herein, and particularly recombinant host cells, have a variety of uses. First, the cells are useful for producing proteins or polypeptides of the invention that can be further purified to produce desired amounts of the proteins or polypeptides, including fusion proteins or polypeptides, encoded by nucleic acids as described herein (e.g., 26651-like or 26138-like polypeptides, mutant forms of 26651-like or 26138-like polypeptides, fusion proteins, etc.). It is further recognized that the nucleic acid sequences of the invention can be altered to contain codons which are preferred, or non-preferred, for a particular expression system. For example, the nucleic acid can be one in which at least one codon, and preferably at least 10% or 20% of the codons, has been altered such that the sequence is optimized for expression in E. coli, yeast, human, insect, or CHO cells. Methods for determining codon usage are well known in the art. Thus, host cells containing expression vectors are useful for polypeptide production.
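
The 10% and 20% thresholds above can be checked for a given pair of sequences by counting codons that differ between the original and the altered coding sequence. The sketch below is illustrative only; the function name is hypothetical, and it does not verify that the altered codons remain synonymous, which would require a codon table.

```python
# Illustrative sketch only: counts differing codons between two in-frame
# coding sequences of equal length; synonymy is NOT checked here.
def fraction_codons_altered(original: str, recoded: str) -> float:
    """Return the fraction of codon positions at which the two sequences differ."""
    original, recoded = original.upper(), recoded.upper()
    if not original or len(original) != len(recoded) or len(original) % 3:
        raise ValueError("sequences must be non-empty, equal-length, "
                         "and a multiple of three bases long")
    codons_a = [original[i:i + 3] for i in range(0, len(original), 3)]
    codons_b = [recoded[i:i + 3] for i in range(0, len(recoded), 3)]
    changed = sum(a != b for a, b in zip(codons_a, codons_b))
    return changed / len(codons_a)


if __name__ == "__main__":
    # CTA and CTG both encode leucine; one of three codons is altered.
    frac = fraction_codons_altered("CTAAAAGGG", "CTGAAAGGG")
    print(f"{frac:.0%} of codons altered")
```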

Host cells are also useful for conducting cell-based assays involving the protein of the invention or fragments. Thus, a recombinant host cell expressing the native protein is useful to assay for compounds that stimulate or inhibit protein function. This can include GTPase binding, gene expression at the level of transcription or translation, effector interaction, and components of a signal transduction or other pathway.

Cells of particular relevance are those in which the protein is expressed as disclosed herein.

Host cells are also useful for identifying mutants in which these functions are affected. If the mutants naturally occur and give rise to a pathology, host cells containing the mutations are useful to assay compounds that have a desired effect on the mutant protein (for example, stimulating or inhibiting function) which may not be indicated by their effect on the native protein.

Recombinant host cells are also useful for expressing the chimeric polypeptides described herein to assess compounds that activate or suppress activation by means of heterologous sites or domains, for example, a binding region, on any given host cell.

Further, mutant proteins of the invention can be designed in which one or more of the various functions is engineered to be increased or decreased (e.g., GTPase binding) and used to augment or replace proteins of the invention in an individual. Thus, host cells can provide a therapeutic benefit by replacing an aberrant protein or providing an aberrant protein that provides a therapeutic result. In one embodiment, the cells provide proteins that are abnormally active.

In another embodiment, the cells provide proteins that are abnormally inactive. These can compete with the endogenous protein in the individual.

In another embodiment, cells expressing the proteins that cannot be activated are introduced into an individual in order to compete with the endogenous protein for GTPase. For example, in the case in which excessive GTPase (or analog) is part of a treatment modality, it may be necessary to inactivate the compound at a specific point in treatment. Providing cells that compete for the compound, but which cannot be affected by protein activation, would be beneficial.

Homologously recombinant host cells can also be produced that allow the in situ alteration of the endogenous polynucleotide sequences in a host cell genome. The host cell includes, but is not limited to, a stable cell line, cell in vivo, or cloned microorganism. This technology is more fully described in WO 93/09222, WO 91/12650, WO 91/06667, U.S. Pat. No. 5,272,071, and U.S. Pat. No. 5,641,670. Briefly, specific polynucleotide sequences corresponding to the GAP polynucleotides or sequences proximal or distal to a GAP gene are allowed to integrate into a host cell genome by homologous recombination where expression of the gene can be affected. In one embodiment, regulatory sequences are introduced that either increase or decrease expression of an endogenous sequence. Accordingly, a GAP protein can be produced in a cell not normally producing it. Alternatively, increased expression of GAP protein can be effected in a cell normally producing the protein at a specific level. Further, expression can be decreased or eliminated by introducing a specific regulatory sequence. The regulatory sequence can be heterologous to the protein sequence or can be a homologous sequence with a desired mutation that affects expression. Alternatively, the entire gene can be deleted. The regulatory sequence can be specific to the host cell or capable of functioning in more than one cell type. Still further, specific mutations can be introduced into any desired region of the gene to produce mutant GAP proteins. Such mutations could be introduced, for example, into the specific functional regions such as the ligand-binding site.

In one embodiment, the host cell can be a fertilized oocyte or embryonic stem cell that can be used to produce a transgenic animal containing the altered gene of the invention. Alternatively, the host cell can be a stem cell or other early tissue precursor that gives rise to a specific subset of cells and can be used to produce transgenic tissues in an animal. See also Thomas et al., Cell 51:503 (1987) for a description of homologous recombination vectors. The vector is introduced into an embryonic stem cell line (e.g., by electroporation) and cells in which the introduced gene has homologously recombined with the endogenous gene are selected (see, e.g., Li, E. et al., Cell 69:915 (1992)). The selected cells are then injected into a blastocyst of an animal (e.g., a mouse) to form aggregation chimeras (see, e.g., Bradley, A. in Teratocarcinomas and Embryonic Stem Cells: A Practical Approach, E. J. Robertson, ed. (IRL, Oxford, 1987) pp. 113-152). A chimeric embryo can then be implanted into a suitable pseudopregnant female foster animal and the embryo brought to term. Progeny harboring the homologously recombined DNA in their germ cells can be used to breed animals in which all cells of the animal contain the homologously recombined DNA by germline transmission of the transgene. Methods for constructing homologous recombination vectors and homologous recombinant animals are described further in Bradley, A. (1991) Current Opinion in Biotechnology 2:823-829 and in PCT International Publication Nos. WO 90/11354; WO 91/01140; and WO 93/04169.

The genetically engineered host cells can be used to produce non-human transgenic animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse, in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA which is integrated into the genome of a cell from which a transgenic animal develops and which remains in the genome of the mature animal in one or more cell types or tissues of the transgenic animal. These animals are useful for studying the function of a protein of the invention and identifying and evaluating modulators of the protein activity.

Other examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens, and amphibians.

In one embodiment, a host cell is a fertilized oocyte or an embryonic stem cell into which polynucleotide sequences of the invention have been introduced.

A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of a fertilized oocyte, e.g., by microinjection or retroviral infection, and allowing the oocyte to develop in a pseudopregnant female foster animal. Any of the nucleotide sequences of the invention can be introduced as a transgene into the genome of a non-human animal, such as a mouse.

Any of the regulatory or other sequences useful in expression vectors can form part of the transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already included. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct expression of the protein to particular cells.

Methods for generating transgenic animals via embryo manipulation and microinjection, particularly animals such as mice, have become conventional in the art and are described, for example, in U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for production of other transgenic animals. A transgenic founder animal can be identified based upon the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or cells of the animals. A transgenic founder animal can then be used to breed additional animals carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to other transgenic animals carrying other transgenes. A transgenic animal also includes animals in which the entire animal or tissues in the animal have been produced using the homologously recombinant host cells described herein.

In another embodiment, transgenic non-human animals can be produced which contain selected systems which allow for regulated expression of the transgene. One example of such a system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP recombinase system, see, e.g., Lakso et al. PNAS 89:6232-6236 (1992). Another example of a recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. Science 251:1351-1355 (1991)). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of “double” transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced according to the methods described in Wilmut, I. et al. Nature 385:810-813 (1997) and PCT International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell, from the transgenic animal can be isolated and induced to exit the growth cycle and enter G₀ phase. The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated oocyte from an animal of the same species from which the quiescent cell is isolated. The reconstructed oocyte is then cultured such that it develops to morula or blastocyst and then transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

Transgenic animals containing recombinant cells that express the polypeptides described herein are useful to conduct the assays described herein in an in vivo context. The various physiological factors that are present in vivo and that could affect GTPase binding or activation, and signal transduction, may not be evident from in vitro cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to assay in vivo GAP function, including GTPase interaction, the effect of specific mutant proteins on GAP function and GTPase interaction, and the effect of chimeric proteins. It is also possible to assess the effect of null mutations, that is, mutations that substantially or completely eliminate one or more protein functions.

In general, methods for producing transgenic animals include introducing a nucleic acid sequence according to the present invention, the nucleic acid sequence capable of expressing the protein in a transgenic animal, into a cell in culture or in vivo. When introduced in vivo, the nucleic acid is introduced into an intact organism such that one or more cell types and, accordingly, one or more tissue types, express the nucleic acid encoding the protein. Alternatively, the nucleic acid can be introduced into virtually all cells in an organism by transfecting a cell in culture, such as an embryonic stem cell, as described herein for the production of transgenic animals, and this cell can be used to produce an entire transgenic organism. As described, in a further embodiment, the host cell can be a fertilized oocyte. Such cells are then allowed to develop in a female foster animal to produce the transgenic organism.

Pharmaceutical Compositions

The nucleic acid molecules, proteins, modulators of the protein, and antibodies (also referred to herein as “active compounds”) can be incorporated into pharmaceutical compositions suitable for administration to a subject, e.g., a human. Such compositions typically comprise the nucleic acid molecule, protein, modulator, or antibody and a pharmaceutically acceptable carrier.

As used herein the language “pharmaceutically acceptable carrier” is intended to include any and all solvents, dispersion media, coatings, antibacterial and antifungal agents, isotonic and absorption delaying agents, and the like, compatible with pharmaceutical administration. The use of such media and agents for pharmaceutically active substances is well known in the art. Except insofar as any conventional media or agent is incompatible with the active compound, such media can be used in the compositions of the invention. Supplementary active compounds can also be incorporated into the compositions. A pharmaceutical composition of the invention is formulated to be compatible with its intended route of administration. Examples of routes of administration include parenteral, e.g., intravenous, intradermal, subcutaneous, oral (e.g., inhalation), transdermal (topical), transmucosal, and rectal administration. Solutions or suspensions used for parenteral, intradermal, or subcutaneous application can include the following components: a sterile diluent such as water for injection, saline solution, fixed oils, polyethylene glycols, glycerine, propylene glycol or other synthetic solvents; antibacterial agents such as benzyl alcohol or methyl parabens; antioxidants such as ascorbic acid or sodium bisulfite; chelating agents such as ethylenediaminetetraacetic acid; buffers such as acetates, citrates or phosphates and agents for the adjustment of tonicity such as sodium chloride or dextrose. pH can be adjusted with acids or bases, such as hydrochloric acid or sodium hydroxide. The parenteral preparation can be enclosed in ampoules, disposable syringes or multiple dose vials made of glass or plastic.

Pharmaceutical compositions suitable for injectable use include sterile aqueous solutions (where water soluble) or dispersions and sterile powders for the extemporaneous preparation of sterile injectable solutions or dispersion. For intravenous administration, suitable carriers include physiological saline, bacteriostatic water, Cremophor EL™ (BASF, Parsippany, N.J.) or phosphate buffered saline (PBS). In all cases, the composition must be sterile and should be fluid to the extent that easy syringability exists. It must be stable under the conditions of manufacture and storage and must be preserved against the contaminating action of microorganisms such as bacteria and fungi. The carrier can be a solvent or dispersion medium containing, for example, water, ethanol, polyol (for example, glycerol, propylene glycol, and liquid polyethylene glycol, and the like), and suitable mixtures thereof. The proper fluidity can be maintained, for example, by the use of a coating such as lecithin, by the maintenance of the required particle size in the case of dispersion and by the use of surfactants. Prevention of the action of microorganisms can be achieved by various antibacterial and antifungal agents, for example, parabens, chlorobutanol, phenol, ascorbic acid, thimerosal, and the like. In many cases, it will be preferable to include isotonic agents, for example, sugars, polyalcohols such as mannitol, sorbitol, sodium chloride in the composition. Prolonged absorption of the injectable compositions can be brought about by including in the composition an agent which delays absorption, for example, aluminum monostearate and gelatin.

Sterile injectable solutions can be prepared by incorporating the active compound (e.g., a protein of the invention or antibody) in the required amount in an appropriate solvent with one or a combination of ingredients enumerated above, as required, followed by filtered sterilization. Generally, dispersions are prepared by incorporating the active compound into a sterile vehicle which contains a basic dispersion medium and the required other ingredients from those enumerated above. In the case of sterile powders for the preparation of sterile injectable solutions, the preferred methods of preparation are vacuum drying and freeze-drying which yields a powder of the active ingredient plus any additional desired ingredient from a previously sterile-filtered solution thereof.

Oral compositions generally include an inert diluent or an edible carrier. They can be enclosed in gelatin capsules or compressed into tablets. For oral administration, the agent can be contained in enteric forms to survive the stomach or further coated or mixed to be released in a particular region of the GI tract by known methods. For the purpose of oral therapeutic administration, the active compound can be incorporated with excipients and used in the form of tablets, troches, or capsules. Oral compositions can also be prepared using a fluid carrier for use as a mouthwash, wherein the compound in the fluid carrier is applied orally and swished and expectorated or swallowed. Pharmaceutically compatible binding agents, and/or adjuvant materials can be included as part of the composition. The tablets, pills, capsules, troches and the like can contain any of the following ingredients, or compounds of a similar nature: a binder such as microcrystalline cellulose, gum tragacanth or gelatin; an excipient such as starch or lactose, a disintegrating agent such as alginic acid, Primogel, or corn starch; a lubricant such as magnesium stearate or Sterotes; a glidant such as colloidal silicon dioxide; a sweetening agent such as sucrose or saccharin; or a flavoring agent such as peppermint, methyl salicylate, or orange flavoring.

For administration by inhalation, the compounds are delivered in the form of an aerosol spray from a pressurized container or dispenser that contains a suitable propellant, e.g., a gas such as carbon dioxide, or from a nebulizer.

Systemic administration can also be by transmucosal or transdermal means. For transmucosal or transdermal administration, penetrants appropriate to the barrier to be permeated are used in the formulation. Such penetrants are generally known in the art, and include, for example, for transmucosal administration, detergents, bile salts, and fusidic acid derivatives. Transmucosal administration can be accomplished through the use of nasal sprays or suppositories. For transdermal administration, the active compounds are formulated into ointments, salves, gels, or creams as generally known in the art.

The compounds can also be prepared in the form of suppositories (e.g., with conventional suppository bases such as cocoa butter and other glycerides) or retention enemas for rectal delivery.

In one embodiment, the active compounds are prepared with carriers that will protect the compound against rapid elimination from the body, such as a controlled release formulation, including implants and microencapsulated delivery systems. Biodegradable, biocompatible polymers can be used, such as ethylene vinyl acetate, polyanhydrides, polyglycolic acid, collagen, polyorthoesters, and polylactic acid. Methods for preparation of such formulations will be apparent to those skilled in the art. The materials can also be obtained commercially from Alza Corporation and Nova Pharmaceuticals, Inc. Liposomal suspensions (including liposomes targeted to infected cells with monoclonal antibodies to viral antigens) can also be used as pharmaceutically acceptable carriers. These can be prepared according to methods known to those skilled in the art, for example, as described in U.S. Pat. No. 4,522,811.

It is especially advantageous to formulate oral or parenteral compositions in dosage unit form for ease of administration and uniformity of dosage. “Dosage unit form” as used herein refers to physically discrete units suited as unitary dosages for the subject to be treated; each unit containing a predetermined quantity of active compound calculated to produce the desired therapeutic effect in association with the required pharmaceutical carrier. The specifications for the dosage unit forms of the invention are dictated by and directly dependent on the unique characteristics of the active compound and the particular therapeutic effect to be achieved, and the limitations inherent in the art of compounding such an active compound for the treatment of individuals.

The nucleic acid molecules of the invention can be inserted into vectors and used as gene therapy vectors. Gene therapy vectors can be delivered to a subject by, for example, intravenous injection, local administration (U.S. Pat. No. 5,328,470) or by stereotactic injection (see e.g., Chen et al., PNAS 91:3054-3057 (1994)). The pharmaceutical preparation of the gene therapy vector can include the gene therapy vector in an acceptable diluent, or can comprise a slow release matrix in which the gene delivery vehicle is imbedded. Alternatively, where the complete gene delivery vector can be produced intact from recombinant cells, e.g. retroviral vectors, the pharmaceutical preparation can include one or more cells which produce the gene delivery system.

As defined herein, a therapeutically effective amount of protein or polypeptide (i.e., an effective dosage) ranges from about 0.001 to 30 mg/kg body weight, preferably about 0.01 to 25 mg/kg body weight, more preferably about 0.1 to 20 mg/kg body weight, and even more preferably about 1 to 10 mg/kg, 2 to 9 mg/kg, 3 to 8 mg/kg, 4 to 7 mg/kg, or 5 to 6 mg/kg body weight.
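
The per-kilogram ranges above translate into absolute amounts by simple multiplication. The following is a minimal arithmetic sketch only; the function names and the 70 kg body weight are hypothetical values chosen for illustration and are not part of the specification.

```python
# Illustrative arithmetic only. The ranges are those recited above; the
# helper names and the example body weight are assumptions for illustration.
RANGES_MG_PER_KG = {
    "broadest": (0.001, 30.0),
    "preferred": (0.01, 25.0),
    "more preferred": (0.1, 20.0),
    "even more preferred": (1.0, 10.0),
}


def absolute_range_mg(range_mg_per_kg, body_weight_kg):
    """Convert a (low, high) mg/kg range into a (low, high) mg range."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = range_mg_per_kg
    return (low * body_weight_kg, high * body_weight_kg)


def ranges_are_nested(ranges):
    """Check that each successive range lies inside the one before it."""
    bounds = list(ranges.values())
    return all(
        outer[0] <= inner[0] and inner[1] <= outer[1]
        for outer, inner in zip(bounds, bounds[1:])
    )


if __name__ == "__main__":
    hypothetical_weight_kg = 70.0  # assumption, not from the specification
    for label, rng in RANGES_MG_PER_KG.items():
        print(label, absolute_range_mg(rng, hypothetical_weight_kg))
```

The nesting check simply reflects that each successively preferred range recited above sits within the preceding one.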

The skilled artisan will appreciate that certain factors may influence the dosage required to effectively treat a subject, including but not limited to the severity of the disease or disorder, previous treatments, the general health and/or age of the subject, and other diseases present. Moreover, treatment of a subject with a therapeutically effective amount of a protein, polypeptide, or antibody can include a single treatment or, preferably, can include a series of treatments. In a preferred example, a subject is treated with antibody, protein, or polypeptide in the range of between about 0.1 to 20 mg/kg body weight, one time per week for between about 1 to 10 weeks, preferably between 2 to 8 weeks, more preferably between about 3 to 7 weeks, and even more preferably for about 4, 5, or 6 weeks. It will also be appreciated that the effective dosage of antibody, protein, or polypeptide used for treatment may increase or decrease over the course of a particular treatment. Changes in dosage may result and become apparent from the results of diagnostic assays as described herein.

The present invention encompasses agents which modulate expression or activity. An agent may, for example, be a small molecule. For example, such small molecules include, but are not limited to, peptides, peptidomimetics, amino acids, amino acid analogs, polynucleotides, polynucleotide analogs, nucleotides, nucleotide analogs, organic or inorganic compounds (i.e., including heteroorganic and organometallic compounds) having a molecular weight less than about 10,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 5,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 1,000 grams per mole, organic or inorganic compounds having a molecular weight less than about 500 grams per mole, and salts, esters, and other pharmaceutically acceptable forms of such compounds.

It is understood that appropriate doses of small molecule agents depend upon a number of factors within the ken of the ordinarily skilled physician, veterinarian, or researcher. The dose(s) of the small molecule will vary, for example, depending upon the identity, size, and condition of the subject or sample being treated, further depending upon the route by which the composition is to be administered, if applicable, and the effect which the practitioner desires the small molecule to have upon the nucleic acid or polypeptide of the invention. Exemplary doses include milligram or microgram amounts of the small molecule per kilogram of subject or sample weight (e.g., about 1 microgram per kilogram to about 500 milligrams per kilogram, about 100 micrograms per kilogram to about 5 milligrams per kilogram, or about 1 microgram per kilogram to about 50 micrograms per kilogram). It is furthermore understood that appropriate doses of a small molecule depend upon the potency of the small molecule with respect to the expression or activity to be modulated. Such appropriate doses may be determined using the assays described herein. When one or more of these small molecules is to be administered to an animal (e.g., a human) in order to modulate expression or activity of a polypeptide or nucleic acid of the invention, a physician, veterinarian, or researcher may, for example, prescribe a relatively low dose at first, subsequently increasing the dose until an appropriate response is obtained. In addition, it is understood that the specific dose level for any particular animal subject will depend upon a variety of factors including the activity of the specific compound employed, the age, body weight, general health, gender, and diet of the subject, the time of administration, the route of administration, the rate of excretion, any drug combination, and the degree of expression or activity to be modulated.

The pharmaceutical compositions can be included in a container, pack, or dispenser together with instructions for administration.

OTHER EMBODIMENTS

In another aspect, the invention features a method of analyzing a plurality of capture probes. The method can be used, e.g., to analyze gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., a nucleic acid or peptide sequence; contacting the array with a 26651 or 26138 nucleic acid, polypeptide, or antibody, each preferably purified; and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the 26651 or 26138 nucleic acid, polypeptide, or antibody.
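
The array described above can be pictured as a mapping from positionally distinct addresses to unique capture probes, with binding read out as a label signal per address. The following is a minimal data-structure sketch under stated assumptions: the class and function names, the row/column layout, and the notion of a numeric signal threshold are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Address:
    """A positionally distinguishable location on a two dimensional array."""
    row: int
    col: int


def build_array(capture_probes, n_cols):
    """Place each capture probe at its own (row, col) address.

    Raises ValueError if any probe is repeated, since each address is
    meant to carry a unique capture probe.
    """
    if n_cols <= 0:
        raise ValueError("n_cols must be positive")
    if len(set(capture_probes)) != len(capture_probes):
        raise ValueError("each address must carry a unique capture probe")
    return {
        Address(i // n_cols, i % n_cols): probe
        for i, probe in enumerate(capture_probes)
    }


def addresses_with_binding(signals, threshold):
    """Return addresses whose label signal exceeds a chosen threshold.

    `signals` maps Address -> measured signal; the threshold is an
    assumption of this sketch, not a value from the specification.
    """
    hits = [addr for addr, value in signals.items() if value > threshold]
    return sorted(hits, key=lambda a: (a.row, a.col))
```

A usage example with placeholder probe names: `build_array(["probe_1", "probe_2", "probe_3"], n_cols=2)` places `probe_3` at row 1, column 0.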

The capture probes can be a set of nucleic acids from a selected sample, e.g., a sample of nucleic acids derived from a control or non-stimulated tissue or cell.

The method can include contacting the 26651 or 26138 nucleic acid, polypeptide, or antibody with a first array having a plurality of capture probes and a second array having a different plurality of capture probes. The results of each hybridization can be compared, e.g., to analyze differences in expression between a first and second sample. The first plurality of capture probes can be from a control sample, e.g., a wild type, normal, or non-diseased, non-stimulated, sample, e.g., a biological fluid, tissue, or cell sample. The second plurality of capture probes can be from an experimental sample, e.g., a mutant type, at risk, disease-state or disorder-state, or stimulated, sample, e.g., a biological fluid, tissue, or cell sample.

The plurality of capture probes can be a plurality of nucleic acid probes each of which specifically hybridizes with an allele of 26651 or 26138. Such methods can be used to diagnose a subject, e.g., to evaluate risk for a disease or disorder, to evaluate suitability of a selected treatment for a subject, or to evaluate whether a subject has a disease or disorder. 26651 or 26138 is associated with GAP activity, thus it is useful for disorders associated with abnormal GTPase signaling, GTPase release of substrates, GTPase activation, or other GAP regulated processes.

The method can be used to detect SNPs.

In another aspect, the invention features a method of analyzing a plurality of probes. The method is useful, e.g., for analyzing gene expression. The method includes: providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which expresses or misexpresses 26651 or 26138 or from a cell or subject in which a 26651 or 26138 mediated response has been elicited, e.g., by contact of the cell with 26651 or 26138 nucleic acid or protein, or administration to the cell or subject of 26651 or 26138 nucleic acid or protein; contacting the array with one or more inquiry probes, wherein an inquiry probe can be a nucleic acid, polypeptide, or antibody (which is preferably other than a 26651 or 26138 nucleic acid, polypeptide, or antibody); providing a two dimensional array having a plurality of addresses, each address of the plurality being positionally distinguishable from each other address of the plurality, and each address of the plurality having a unique capture probe, e.g., wherein the capture probes are from a cell or subject which does not express 26651 or 26138 (or does not express it as highly as in the case of the 26651 or 26138 positive plurality of capture probes) or from a cell or subject in which a 26651 or 26138 mediated response has not been elicited (or has been elicited to a lesser extent than in the first sample); contacting the array with one or more inquiry probes (which are preferably other than a 26651 or 26138 nucleic acid, polypeptide, or antibody), and thereby evaluating the plurality of capture probes. Binding, e.g., in the case of a nucleic acid, hybridization with a capture probe at an address of the plurality, is detected, e.g., by signal generated from a label attached to the nucleic acid, polypeptide, or antibody.

In another aspect, the invention features a method of analyzing 26651 or 26138, e.g., analyzing structure, function, or relatedness to other nucleic acid or amino acid sequences. The method includes: providing a 26651 or 26138 nucleic acid or amino acid sequence; comparing the 26651 or 26138 sequence with one or more, preferably a plurality of, sequences from a collection of sequences, e.g., a nucleic acid or protein sequence database; to thereby analyze 26651 or 26138.

Preferred databases include GenBank™. The method can include evaluating the sequence identity between a 26651 or 26138 sequence and a database sequence. The method can be performed by accessing the database at a second site, e.g., over the internet.

In another aspect, the invention features a set of oligonucleotides, useful, e.g., for identifying SNPs, or identifying specific alleles of 26651 or 26138. The set includes a plurality of oligonucleotides, each of which has a different nucleotide at an interrogation position, e.g., an SNP or the site of a mutation. In a preferred embodiment, the oligonucleotides of the plurality are identical in sequence with one another (except for differences in length). The oligonucleotides can be provided with different labels, such that an oligonucleotide that hybridizes to one allele provides a signal that is distinguishable from an oligonucleotide which hybridizes to a second allele.
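
A set of oligonucleotides that differ only at an interrogation position can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and the short template sequence are hypothetical placeholders and do not correspond to any 26651 or 26138 sequence.

```python
def interrogation_set(template, position, alphabet="ACGT"):
    """Return one oligonucleotide per base in `alphabet`.

    Every member is identical to `template` except at the 0-based
    `position`, which takes each base of the alphabet in turn.
    """
    if not 0 <= position < len(template):
        raise IndexError("interrogation position lies outside the template")
    return [
        template[:position] + base + template[position + 1:]
        for base in alphabet
    ]


def differing_positions(seq_a, seq_b):
    """List the 0-based positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]


if __name__ == "__main__":
    hypothetical_template = "GATTACAGG"  # placeholder, not a real allele
    for oligo in interrogation_set(hypothetical_template, position=4):
        print(oligo)
```

In such a set each member could carry a different label, so that the signal identifies which nucleotide is present at the interrogation position.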

This invention is further illustrated by the following examples which should not be construed as limiting. The contents of all references, patents and published patent applications cited throughout this application are incorporated herein by reference.

EXPERIMENTAL

Example 1 Identification and Characterization of Human 26651 GAP

The human 26651 GAP sequence (FIGS. 52A-52B), which is approximately 2847 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 1644 nucleotides (nucleotides 60-1703 of SEQ ID NO:21; SEQ ID NO:23). The coding sequence encodes a 547 amino acid protein (SEQ ID NO:22).
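
The coding-region coordinates given above (nucleotides 60-1703 for 26651, and nucleotides 78-3095 reported for 26138 in Example 2) can be checked against the stated protein lengths with a little arithmetic. The sketch below assumes, as is conventional but not stated explicitly in the text, that the coordinates are 1-based and inclusive and that the final codon of each range is a stop codon; the function name is hypothetical.

```python
def coding_region_summary(start, end):
    """Return (length in nucleotides, encoded amino acids) for a coding range.

    Assumptions of this sketch: `start` and `end` are 1-based inclusive
    coordinates, and the last codon in the range is a stop codon that does
    not contribute an amino acid.
    """
    if end < start:
        raise ValueError("end must not precede start")
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("coding range is not a whole number of codons")
    codons = length_nt // 3
    return length_nt, codons - 1


if __name__ == "__main__":
    print(coding_region_summary(60, 1703))   # 26651 coordinates
    print(coding_region_summary(78, 3095))   # 26138 coordinates
```

Under these assumptions the 26651 range spans 1644 nucleotides and encodes 547 amino acids, and the 26138 range spans 3018 nucleotides and encodes 1005 amino acids, which agrees with the protein lengths recited for SEQ ID NO:22 and SEQ ID NO:25.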

PFAM analysis indicates that the 26651 polypeptide shares a high degree of sequence similarity with GAPs. Further, PFAM analysis indicates that the 26651 polypeptide shares a high degree of sequence similarity with the Rho-GAP subclass. For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and www.psc.edu/general/software/packages/pfam/pfam.html.

As used herein, the term “Rho-GAP domain” includes an amino acid sequence of about 80-300 amino acid residues in length and having a bit score for the alignment of the sequence to the Rho-GAP domain (HMM) of at least 8. Preferably, a Rho-GAP domain includes at least about 100-250 amino acids, more preferably about 120-200 amino acid residues, or about 120-180 amino acids and has a bit score for the alignment of the sequence to the Rho-GAP domain (HMM) of at least 16 or greater. The Rho-GAP domain (HMM) has been assigned the PFAM Accession PF00620 (pfam.wustl.edu/). An alignment of the Rho-GAP domain (amino acids 236 to 397 of SEQ ID NO:22) of human 26651-like polypeptides with a consensus amino acid sequence derived from a hidden Markov model (SEQ ID NO:27) is depicted in FIGS. 56A-B.

In a preferred embodiment a 26651-like polypeptide or protein has a “Rho-GAP domain” or a region which includes at least about 100-250, more preferably about 120-200, or 120-180 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity with a “Rho-GAP domain,” e.g., the Rho-GAP domain of human 26651 (e.g., amino acid residues 236-397 of SEQ ID NO:22).
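
The percent sequence identity criterion above can be illustrated with a short calculation over a pairwise alignment. This is a minimal sketch under stated assumptions: the function names are hypothetical, the inputs are two already-aligned strings with "-" marking gaps, and identity is computed over aligned (non-gap) columns, which is only one of several conventions in use; the specification does not fix the convention here.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned, non-gap columns in which both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def domain_length(first_residue, last_residue):
    """Length of a domain given 1-based inclusive residue coordinates."""
    return last_residue - first_residue + 1


if __name__ == "__main__":
    # Residues 236-397 of SEQ ID NO:22, as recited above.
    print(domain_length(236, 397))
    # Placeholder aligned fragments, not real 26651 sequence.
    print(percent_identity("ACDEFGHIKL", "ACDEFGHIKV"))
```

With the coordinates recited above, the Rho-GAP domain of human 26651 spans 162 residues, which falls within the recited 120-180 residue range.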

PFAM analysis indicates that the 26651 polypeptide shares a high degree of sequence similarity with dockerin. The dockerin domain (HMM) has been assigned the PFAM Accession PF00404 (pfam.wustl.edu/). The dockerin domain of 26651 falls between amino acids 278 to 298 of SEQ ID NO:22.

To identify the presence of a “Rho-GAP domain” in a 26651-like protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference.
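
The effect of the two threshold scores mentioned above (a default of 15 and a lowered value of 8 bits) can be illustrated by filtering a table of bit scores. This sketch does not call the HMMER package and makes no claim about its interface; the function name, the domain labels, and the scores are hypothetical placeholders, and treating a score equal to the threshold as a hit is an assumption of the sketch.

```python
DEFAULT_THRESHOLD_BITS = 15.0  # default threshold recited above
LOWERED_THRESHOLD_BITS = 8.0   # lowered threshold recited above


def hits_at_threshold(bit_scores, threshold):
    """Keep the entries whose bit score meets or exceeds the threshold.

    `bit_scores` maps a domain label to its alignment bit score. Counting a
    score exactly equal to the threshold as a hit is an assumption here.
    """
    return {
        name: score for name, score in bit_scores.items() if score >= threshold
    }


if __name__ == "__main__":
    # Hypothetical scores for illustration only; not results from the source.
    example_scores = {"domain_A": 42.5, "domain_B": 11.0, "domain_C": 3.2}
    print(hits_at_threshold(example_scores, DEFAULT_THRESHOLD_BITS))
    print(hits_at_threshold(example_scores, LOWERED_THRESHOLD_BITS))
```

Lowering the threshold admits weaker matches, so the set of reported hits can only stay the same or grow.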

ProDom analysis of 26651 revealed 35% identity to a protein GTPase domain (p99.2 (80) P85A(4) P85B(4) CHIN(2)//PROTEIN GTPASE DOMAIN SH2 ACTIVATION ZINC 3-KINASE SH3 PHOSPHATIDYLINOSITOL REGULATORY). ProDom analysis further revealed regions having 27% and 37% identity with the long isoform of RhoGapX-1 (P99.2 (2) 043182(1) 043437(1)//RHO-TYPE GTPASE ACTIVATING PROTEIN RHOGAPX-1 LONG ISOFORM). A region having 28% identity to a trithorax transcription regulation protein (p99.2 (3) Q24742(1) Q27255(1) TRX(1)//TRITHORAX PROTEIN PREDICTED TRX TRANSCRIPTION REGULATION ZINC-FINGER METAL-BINDING DNA-BINDING NUCLEAR) was identified by ProDom analysis. The ProDom analysis also identified regions of 26651 with 29%, 26%, and 23% identity to the T-DNA region of a Ti plasmid (p99.2 (1) Q44390-AGRTU//T1 PLASMID PT11 5955 T-DNA REGION PLASMID), a cosmid (p99.2 (1) Q20299_CAELL//COSMID F41H10), and a hypothetical protein (p99.2 (1) O26888-METTH//HYPOTHETICAL 21.6 KD PROTEIN HYPOTHETICAL PROTEIN), respectively.

Example 2 Identification and Characterization of Human 26138 GAP

The human 26138 GAP sequence (FIGS. 57A-57C), which is approximately 3391 nucleotides long including untranslated regions, contains a predicted methionine-initiated coding sequence of about 3018 nucleotides (nucleotides 78-3095 of SEQ ID NO:24; SEQ ID NO:26). The coding sequence encodes a 1005 amino acid protein (SEQ ID NO:25).

PFAM analysis indicates that the 26138 polypeptide shares a high degree of sequence similarity with GAPs. Further, PFAM analysis indicates that the 26138 polypeptide shares a high degree of sequence similarity with the Ras-GAP subclass. For general information regarding PFAM identifiers, PS prefix and PF prefix domain identification numbers, refer to Sonnhammer et al. (1997) Proteins 28:405-420 and www.psc.edu/general/software/packages/pfam/pfam.html.

As used herein, the term “Ras-GAP domain” includes an amino acid sequence of about 80-300 amino acid residues in length and having a bit score for the alignment of the sequence to the Ras-GAP domain (HMM) of at least 8. Preferably, a Ras-GAP domain includes at least about 100-250 amino acids, more preferably about 130-200 amino acid residues, or about 160-200 amino acids and has a bit score for the alignment of the sequence to the Ras-GAP domain (HMM) of at least 16 or greater. The Ras-GAP domain (HMM) has been assigned the PFAM Accession PF00616 (pfam.wustl.edu/). An alignment of the Ras-GAP domain (amino acids 473 to 645 of SEQ ID NO:25) of human 26138 with a consensus amino acid sequence derived from a hidden Markov model (SEQ ID NO:29) is depicted in FIGS. 61A-61B.

In a preferred embodiment a 26138-like polypeptide or protein has a “Ras-GAP domain” or a region which includes at least about 100-250, more preferably about 130-200, or 160-200 amino acid residues and has at least about 60%, 70%, 80%, 90%, 95%, 99%, or 100% sequence identity with a “Ras-GAP domain,” e.g., the Ras-GAP domain of human 26138 (e.g., amino acid residues 473 to 645 of SEQ ID NO:25).

PFAM analysis indicates that the 26138 polypeptide shares a high degree of sequence similarity with the gntR family of bacterial regulatory proteins. The gntR domain (HMM) has been assigned the PFAM Accession Number PF00392 (pfam.wustl.edu/). The gntR domain of 26138 falls between amino acids 405 to 433 of SEQ ID NO:25. PFAM analysis indicates that the 26138 polypeptide shares a region (amino acids 253 to 287 of SEQ ID NO:25) with similarity to the pleckstrin homology domain. The pleckstrin homology domain has been assigned the PFAM Accession Number PF00169.

To identify the presence of a “Ras-GAP domain” in a 26138-like protein sequence, and make the determination that a polypeptide or protein of interest has a particular profile, the amino acid sequence of the protein can be searched against a database of HMMs (e.g., the Pfam database, release 2.1) using the default parameters (www.sanger.ac.uk/Software/Pfam/HMM_search). For example, the hmmsf program, which is available as part of the HMMER package of search programs, is a family specific default program for MILPAT0063 and a score of 15 is the default threshold score for determining a hit. Alternatively, the threshold score for determining a hit can be lowered (e.g., to 8 bits). A description of the Pfam database can be found in Sonnhammer et al. (1997) Proteins 28(3):405-420 and a detailed description of HMMs can be found, for example, in Gribskov et al. (1990) Meth. Enzymol. 183:146-159; Gribskov et al. (1987) Proc. Natl. Acad. Sci. USA 84:4355-4358; Krogh et al. (1994) J. Mol. Biol. 235:1501-1531; and Stultz et al. (1993) Protein Sci. 2:305-314, the contents of which are incorporated herein by reference.

ProDom analysis of 26138 revealed 25% identity to p99.2 (1) O44242_CAEEL//GAP 2-4. Further analysis revealed 38% identity to a ras-GAP inhibitory regulator (p99.2 (29) GTPA(3) NF1(2) GAP1(2)//PROTEIN GTPASE ACTIVATION GTPASE-ACTIVATING RAS NEUROFIBROMIN P21 ACTIVATOR INHIBITORY REGULATOR). A region having 31% identity to a filament intermediate repeat heptad (p99.2 (314) LAMA(10) DESM(8) LAM1(8)//FILAMENT INTERMEDIATE REPEAT HEPTAD PATTERN COILED COIL KERATIN PROTEIN TYPE) was identified by ProDom analysis. 26138 is 26%, 30%, and 26% identical to several hypothetical proteins: p99.2 (1) YWKC_BACSU//HYPOTHETICAL 21.1 KD PROTEIN IN TDK-PRFA INTERGENIC REGION; p99.2 (1) YBYO_YEAST//HYPOTHETICAL 47.4 KD PROTEIN IN OPY1-AGP2 INTERGENIC REGION; and p99.2 (1) YFHG_ECOLI//HYPOTHETICAL 27.3 KD PROTEIN IN GLNB-PURL INTERGENIC REGION ORF-1 F239, respectively. 26138 also shares 23% identity to GAG GAG-POL polypeptides (p99.2 (2) Q88284(1) Q88285(1)//POLYPROTEIN GAG GAG-POL).

Example 3 Tissue Distribution of 26138 mRNA

Expression levels of 26138 in various tissue and cell types were determined by quantitative RT-PCR (Reverse Transcriptase Polymerase Chain Reaction; Taqman® brand PCR kit, Applied Biosystems). The quantitative RT-PCR reactions were performed according to the kit manufacturer's instructions. The results of the Taqman® analysis are shown in FIGS. 64A-64B.

26138 was expressed in a variety of human tissues including tonsil, spleen, fetal liver, adult liver, fibrotic liver, granulocytes, neutrophils, erythroid cells, adipose tissue, bone marrow, colon, lung, kidney, heart, lymphocyte, megakaryocytes and T-cells.

Example 4 Tissue Distribution of 26651 or 26138 mRNA

Northern blot hybridizations with various RNA samples are performed under standard conditions and washed under stringent conditions, i.e., 0.2×SSC at 65° C. A DNA probe corresponding to all or a portion of the 26651 or 26138 cDNA (SEQ ID NO:21 or 24) can be used. The DNA is radioactively labeled with ³²P-dCTP using the Prime-It Kit (Stratagene, La Jolla, Calif.) according to the instructions of the supplier. Filters containing mRNA from mouse hematopoietic and endocrine tissues, and cancer cell lines (Clontech, Palo Alto, Calif.) are probed in ExpressHyb hybridization solution (Clontech) and washed at high stringency according to manufacturer's recommendations.

Example 5 Recombinant Expression of 26651 or 26138 in Bacterial Cells

In this example, 26651 or 26138 is expressed as a recombinant glutathione-S-transferase (GST) fusion polypeptide in E. coli and the fusion polypeptide is isolated and characterized. Specifically, 26651 or 26138 is fused to GST and this fusion polypeptide is expressed in E. coli, e.g., strain PEB199. Expression of the GST-26651 or 26138 fusion protein in PEB199 is induced with IPTG. The recombinant fusion polypeptide is purified from crude bacterial lysates of the induced PEB199 strain by affinity chromatography on glutathione beads. Using polyacrylamide gel electrophoretic analysis of the polypeptide purified from the bacterial lysates, the molecular weight of the resultant fusion polypeptide is determined.
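When reading the gel, it can help to have a rough expected size for the fusion polypeptide. The sketch below is a back-of-the-envelope estimate only. It assumes a GST moiety of about 26 kDa and an average residue mass of about 110 Da, both common rules of thumb rather than values taken from this disclosure; the function name is hypothetical, and the residue count must be supplied from the relevant amino acid sequence.

```python
def estimate_fusion_mass_kda(n_residues: int,
                             tag_kda: float = 26.0,
                             avg_residue_da: float = 110.0) -> float:
    """Rough expected mass (kDa) of a tagged fusion polypeptide.

    Assumptions (not from the source): the GST tag contributes ~26 kDa and
    each residue of the fused polypeptide contributes ~110 Da on average.
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return tag_kda + n_residues * avg_residue_da / 1000.0


# Example with an arbitrary, illustrative length of 300 residues.
expected_kda = estimate_fusion_mass_kda(300)
```

Apparent mobility on SDS-PAGE can deviate from such an estimate, so the value serves only as a guide for locating the band.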

Example 6 Expression of Recombinant 26651 or 26138 Protein in COS Cells

To express the 26651 or 26138 gene in COS cells, the pCDNA/Amp vector by Invitrogen Corporation (San Diego, Calif.) is used. This vector contains an SV40 origin of replication, an ampicillin resistance gene, an E. coli replication origin, a CMV promoter followed by a polylinker region, and an SV40 intron and polyadenylation site. A DNA fragment encoding the entire 26651 or 26138 protein and an HA tag (Wilson et al. (1984) Cell 37:767) or a FLAG tag fused in-frame to the 3′ end of the fragment is cloned into the polylinker region of the vector, thereby placing the expression of the recombinant protein under the control of the CMV promoter.

To construct the plasmid, the 26651 or 26138 DNA sequence is amplified by PCR using two primers. The 5′ primer contains the restriction site of interest followed by approximately twenty nucleotides of the 26651 or 26138 coding sequence starting from the initiation codon; the 3′ primer contains sequences complementary to the other restriction site of interest, a translation stop codon, the HA tag or FLAG tag and the last 20 nucleotides of the 26651 or 26138 coding sequence. The PCR amplified fragment and the pCDNA/Amp vector are digested with the appropriate restriction enzymes and the vector is dephosphorylated using the CIAP enzyme (New England Biolabs, Beverly, Mass.). Preferably the two restriction sites chosen are different so that the 26651 or 26138 gene is inserted in the correct orientation. The ligation mixture is transformed into E. coli cells (strains HB101, DH5α, SURE, available from Stratagene Cloning Systems, La Jolla, Calif., can be used), the transformed culture is plated on ampicillin media plates, and resistant colonies are selected. Plasmid DNA is isolated from transformants and examined by restriction analysis for the presence of the correct fragment.
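The primer layout described above can be sketched in code. This is an illustrative sketch under stated assumptions, not the primers actually used: the coding sequence is a short hypothetical placeholder, EcoRI (GAATTC) and XhoI (CTCGAG) are arbitrary example sites, the HA coding sequence shown is one commonly used encoding of YPYDVPDYA, and any native stop codon is assumed to be removed so that the tag is read in frame.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def design_primers(cds: str, site_5: str, site_3: str, tag_coding: str,
                   stop: str = "TAA", anneal_len: int = 20):
    """Build forward and reverse primers, both written 5'->3'.

    Forward: restriction site + first `anneal_len` nt from the initiation codon.
    Reverse: complement of (last `anneal_len` nt + tag + stop + restriction site).
    """
    cds = cds.upper()
    if not cds.startswith("ATG"):
        raise ValueError("coding sequence must start at the initiation codon")
    if len(cds) % 3 == 0 and cds[-3:] in STOP_CODONS:
        cds = cds[:-3]  # assumption: drop native stop so the tag is in frame
    if len(cds) < anneal_len:
        raise ValueError("coding sequence shorter than annealing length")
    forward = site_5.upper() + cds[:anneal_len]
    sense_3prime = cds[-anneal_len:] + tag_coding.upper() + stop + site_3.upper()
    reverse = reverse_complement(sense_3prime)
    return forward, reverse


# Hypothetical inputs for illustration only.
TOY_CDS = "ATGGCTAGCAAGGGTGAAGAACTGTTCACCGGTGTT"  # placeholder, not 26651/26138
HA_TAG = "TACCCATACGATGTTCCAGATTACGCT"           # encodes YPYDVPDYA
fwd, rev = design_primers(TOY_CDS, "GAATTC", "CTCGAG", HA_TAG)
```

Using two different sites, as the text recommends, is what fixes the orientation of the insert. In practice a few extra bases are often added 5′ of each restriction site to help the enzyme cut near the fragment end; that refinement is omitted here.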

COS cells are subsequently transfected with the 26651 or 26138-pCDNA/Amp plasmid DNA using the calcium phosphate or calcium chloride co-precipitation methods, DEAE-dextran-mediated transfection, lipofection, or electroporation. Other suitable methods for transfecting host cells can be found in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989. The expression of the 26651 or 26138 polypeptide is detected by radiolabelling (³⁵S-methionine or ³⁵S-cysteine available from NEN, Boston, Mass., can be used) and immunoprecipitation (Harlow, E. and Lane, D. Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1988) using an HA specific monoclonal antibody. Briefly, the cells are labeled for 8 hours with ³⁵S-methionine (or ³⁵S-cysteine). The culture media are then collected and the cells are lysed using detergents (RIPA buffer, 150 mM NaCl, 1% NP-40, 0.1% SDS, 0.5% DOC, 50 mM Tris, pH 7.5). Both the cell lysate and the culture media are precipitated with an HA specific monoclonal antibody. Precipitated polypeptides are then analyzed by SDS-PAGE.

Alternatively, DNA containing the 26651 or 26138 coding sequence is cloned directly into the polylinker of the pCDNA/Amp vector using the appropriate restriction sites. The resulting plasmid is transfected into COS cells in the manner described above, and the expression of the 26651 or 26138 polypeptide is detected by radiolabelling and immunoprecipitation using a 26651 or 26138 specific monoclonal antibody.

This invention may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will fully convey the invention to those skilled in the art. Many modifications and other embodiments of the invention will come to mind to one skilled in the art to which this invention pertains having the benefit of the teachings presented in the foregoing description. Although specific terms are employed, they are used as in the art unless otherwise indicated.

1. An isolated nucleic acid molecule selected from the group consisting of: a) a nucleic acid molecule comprising a nucleotide sequence that is at least 60% identical to the nucleotide sequence set forth in SEQ ID NO:1, 6, 8, 10, 12, 14, 16, 18, 20, 21, 23, 24, or 26, or the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918, wherein said nucleotide sequence encodes a polypeptide having biological activity; b) a nucleic acid molecule comprising a fragment of at least 20 contiguous nucleotides of the nucleotide sequence set forth in SEQ ID NO:1, 6, 8, 10, 12, 14, 16, 18, 20, 21, 23, 24, or 26, or the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; c) a nucleic acid molecule encoding a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25 or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; d) a nucleic acid molecule which encodes a fragment of a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918, wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25 or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; e) a nucleic acid molecule encoding a biologically active variant of the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25 or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, or PTA-1871, or PTA-1918, wherein the nucleic acid molecule hybridizes the complement of the nucleotide sequence set forth in SEQ ID NO:1, 6, 8, 10, 12, 14, 16, 18, 20, 21, 23, 24, or 26, or the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918 under stringent conditions; and f) a nucleic acid molecule comprising the complement of the nucleic acid molecule of a), b), c), d), or e).
2. The isolated nucleic acid molecule of claim 1, wherein said nucleic acid molecule is selected from the group consisting of: a) a nucleic acid molecule comprising the nucleotide sequence set forth in SEQ ID NO:1, 6, 8, 10, 12, 14, 16, 18, 20, 21, 23, 24, or 26, or a complement thereof; b) a nucleic acid molecule comprising the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, or PTA-1871; c) a nucleic acid molecule encoding a polypeptide comprising the amino acid sequence of SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or a complement thereof; and d) a nucleic acid molecule encoding the polypeptide encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, or PTA-1871.
3. The nucleic acid molecule of claim 1, further comprising vector nucleic acid sequences.
4. The nucleic acid molecule of claim 1 further comprising nucleic acid sequences encoding a heterologous polypeptide.
5. A host cell that contains the nucleic acid molecule of claim 3.
6. An isolated polypeptide selected from the group consisting of: a) a biologically active polypeptide which is encoded by a nucleic acid molecule comprising a nucleotide sequence which is at least 60% identical to a nucleic acid comprising the nucleotide sequence set forth in SEQ ID NO:1, 6, 8, 10, 12, 14, 16, 18, 20, 21, 23, 24, or 26, or the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; b) a naturally occurring allelic variant of a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a nucleic acid molecule comprising the complement of SEQ ID NO:1, 6, 8, 10, 12, 14, 16, 18, 20, 21, 23, 24, or 26, or the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918 under stringent conditions; and, c) a fragment of a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918, wherein the fragment comprises at least 15 contiguous amino acids of the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; and d) a polypeptide having at least 60% sequence identity to the amino acid sequence SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, wherein the polypeptide has biological activity.
7. The isolated polypeptide of claim 6 comprising the amino acid sequence of SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25.
8. The polypeptide of claim 6 further comprising heterologous amino acid sequences.
9. An antibody which selectively binds to a polypeptide of claim 6.
10. A method for producing a polypeptide selected from the group consisting of: a) a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; b) a polypeptide comprising a fragment of the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918, wherein the fragment comprises at least 15 contiguous amino acids of SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; c) a biologically active variant of a polypeptide comprising the amino acid sequence set forth in SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, or the amino acid sequence encoded by the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918, wherein the polypeptide is encoded by a nucleic acid molecule which hybridizes to a nucleic acid molecule comprising the complement of SEQ ID NO:1, 6, 8, 10, 12, 14, 16, 18, 20, 21, 23, 24, or 26, or the nucleotide sequence of the cDNA insert of the plasmid deposited with ATCC as Patent Deposit Number PTA-1850, PTA-2170, PTA-2812, PTA-2171, PTA-2813, PTA-2011, PTA-2012, PTA-2341, PTA-1849, PTA-1915, PTA-1871, or PTA-1918; and d) a polypeptide having at least 60% sequence identity to the amino acid sequence of SEQ ID NO:2, 5, 7, 9, 11, 13, 15, 17, 19, 22, or 25, wherein said polypeptide has biological activity; comprising culturing the host cell of claim 5 under conditions in which the nucleic acid molecule is expressed.
11. A method for detecting the presence of a polypeptide of claim 6 in a sample, comprising: a) contacting the sample with a compound which selectively binds to a polypeptide of claim 6; and b) determining whether the compound binds to the polypeptide in the sample.
12. The method of claim 11, wherein the compound which binds to the polypeptide is an antibody.
13. A kit comprising a compound which selectively binds to a polypeptide of claim 6 and instructions for use.
14. A method for detecting the presence of a nucleic acid molecule of claim 1 in a sample, comprising the steps of: a) contacting the sample with a nucleic acid probe or primer which selectively hybridizes to the nucleic acid molecule; and b) determining whether the nucleic acid probe or primer binds to a nucleic acid molecule in the sample.
15. The method of claim 14, wherein the sample comprises mRNA molecules and is contacted with a nucleic acid probe.
16. A kit comprising a compound which selectively hybridizes to a nucleic acid molecule of claim 1 and instructions for use.
17. A method for identifying a compound that binds to a polypeptide of claim 6 comprising the steps of: a) contacting a polypeptide, or a cell expressing a polypeptide of claim 6 with a test compound; and b) determining whether the polypeptide binds to the test compound.
18. The method of claim 17, wherein the binding of the test compound to the polypeptide is detected by a method selected from the group consisting of: a) detection of binding by direct detecting of test compound/polypeptide binding; b) detection of binding using a competition binding assay; c) detection of binding using an assay for GAP mediated nucleotide exchange.
19. A method for modulating the activity of a polypeptide of claim 6 comprising contacting a polypeptide or a cell expressing a polypeptide of claim 6 with a compound which binds to the polypeptide in a sufficient concentration to modulate the activity of the polypeptide.
20. A method for identifying a compound which modulates the activity of a polypeptide of claim 6, comprising: a) contacting a polypeptide of claim 6 with a test compound; and b) determining the effect of the test compound on the activity of the polypeptide to thereby identify a compound that modulates the activity of the polypeptide.